PREFACE.



This book is based on lectures given at the London 
School of Economics and Political Science in the five 
years following its foundation in 1895. There seems to 
be no text-book in English dealing directly and com- 
pletely with the common methods of statistics. English 
writings on the various branches of the science are for 
the most part in the form of articles in the journals of 
learned societies. Professor Mayo Smith in his Statistics 
and Sociology proceeds almost at once to historical 
applications; while in Professor Meitzen's Geschichte,
Theorie, und Technik der Statistik, issued in English
by the American Academy of Political and Social 
Science, so much space is devoted to the history of 
the development of statistics, and the book is so slight, 
in comparison with the wide field it covers, that many 
elementary methods are treated very cursorily. In the 
excellent books in French, German, and Italian on this 
subject there is a general tendency to deal at length 
with the history of official statistics, the limits of the 
science, and particular applications of the theory of pro- 
bability, to the exclusion of more general matter; so that 
a student must refer to the works of Dr Mayr, Professor 
Westergaard, Professor Lexis, Professor Gabaglio, M. 
Block, and Dr Bertillon before he is completely acquainted
with the elementary methods of statistics. The
result is that there is no compact statement of principles
acknowledged by statisticians, of the methods common to 
most branches of statistical work, of the artifices developed 
for handling and simplifying the raw material, and of the 
mathematical theorems by the use of which the results of 
investigations may be interpreted. This book forms an 
attempt to supply this want, so far as can be done without 
undue length. No place has been given in it to the 
history of statistics, and it does not contain any summary 
of the main groups of statistics extant ; several tables, 
drawn from a wide range of subjects, are given, but only 
to illustrate particular methods, and their choice has been 
determined by their suitability for this purpose. In the 
chapter on Collection of Material some account is offered
of the genesis of the most important English statistics:
the great part of the figures tabulated in the Statistical
Abstract can be traced back to the householder's schedule
of the Population Census or the custom house returns of 
foreign trade, while the chief statistics accessible for the 
study of modern social questions have come from the 
Wage Census of 1886 or are collected by the Labour 
Department : it is hoped that the account of these four 
groups of figures will afford some help in judging of 
their accuracy and limitations. Considerable space has 
been allotted to the subjects of Averages and Diagrams, 
because their use is universal, and, while their principles 
and technique are simple, their application is often mis- 
understood. The chapter on Accuracy is based on the 
Newmarch Lectures of 1897, and may perhaps be found 
to contain something that is new, a claim which is not 
made for the great part of the book. The treatment 
throughout is intended to be suitable for those whose
mathematics have not been carried to any height or have
become rusty from disuse. With this view, when mathe- 
matical symbols were unavoidable, the preliminary 
hypotheses have been first discussed without algebraic 
notation and at some length, and those proofs have been 
chosen which require the minimum mathematical know- 
ledge rather than those which lead most directly to the 
result. Thus the most important results of the Theory 
of Error have been obtained without the use of the Differ- 
ential or Integral Calculus, and it is hoped that the greater 
part even of the chapter on Correlation will be intelligible 
to those who are not so well equipped as the Major- 
General in the Pirates of Penzance. Part II. is in- 
tended to be introductory and is certainly incomplete ; the 
normal law of frequency is the only one discussed, and 
the correlation of three variables is untouched. The more 
advanced treatment of this part of the subject is likely 
to be of interest to but few, who will have little trouble 
in obtaining the books and journals in which the further 
development may be found. Short bibliographies are 
added to the chapter on Interpolation and to Part II. for 
this purpose. It is hoped that this elementary handling 
may be of use to some who are interested in the statistical 
arguments based on the Laws of Probability, and that 
the definitions, formulae, and proofs given may save others 
from the necessity of searching in books, long out of 
print, for elementary theorems and deductions. The 
treatment in Part II., Section II., is peculiar in that it 
leaves very much in the background the Method of Least 
Squares ; the phrase, useful in some connections, seems 
to make the application of the Law of Error to statistics 
unnecessarily complex. I am much indebted to Professor
Edgeworth, who has not only given me continued help 
both privately and by his publications in the study of the 
mathematical treatment of statistics, but has also read 
Part II. in proof and suggested many useful and important 
alterations. My thanks are also due to Professor Everett 
and Mr W. F. Sheppard for help in the chapter on 
Interpolation, and to Mr C. P. Sanger and Mr H. Clissold 
for reading great parts of the book in proof. 

A. L. B. 



London School of Economics, 
January 1901.
CHAPTER I.

SCOPE AND MEANING OF STATISTICS. 

Very many definitions have been given of the word statistics,
and each author who has written on the subject has assigned new
limits to the field which should be included in its
scope. It will not be necessary for the purpose of
this book to discuss the merely verbal differences involved, but 
only to explain what is intended by its title, and to consider 
the limits of the science which it is proposed to investigate. It 
will be useful, however, to mention some possible definitions. 

Statistics may, for instance, be called the science of counting.
Counting appears at first sight to be a very simple operation,
which any one can perform or which can be done
automatically; but, as a matter of fact, when we
come to large numbers, e.g., the population of the United
Kingdom, counting is by no means easy, or within the power of an
individual ; limits of time and place alone prevent it being so 
carried out, and in no way can absolute accuracy be obtained 
when the numbers surpass certain limits. Great numbers are 
not counted correctly to a unit, they are estimated; and we might
perhaps point to this as a division between arithmetic
and statistics, that whereas arithmetic attains
exactness, statistics deals with estimates, sometimes
very accurate, and often sufficiently so for their purpose,
but never mathematically exact. Statistics generally relate to 
numbers so great that their estimation is beyond the power of
an individual, and requires the co-operation of an
organised body of workers. Though the collection
of numbers by several persons and the mere
addition of the results seem simply questions of arithmetic, yet
in practice two difficulties soon occur. First, it is not easy to 
define the thing to be counted so explicitly that all the tellers 
shall admit and reject instances on the same principles ; for such 
simple objects as the number of rooms or stories of a house, a 
person's age, even an individual, give rise to such complex ques- 
tions of definition that it is often impossible to tell from a short 
description of a category exactly what items are included in it. 
Secondly, numerical errors cannot be avoided when many 
workers are involved ; for some among a large number of 
persons will be inaccurate, some unintelligent, some will not 
obtain complete information, and when their reports are com- 
piled there will be occasional mistakes in copying and errors in 
tabulation. A total which is the result of the work of many 
hands will certainly from one cause or another fall short of 
complete accuracy. But though all estimates of this nature are 
sometimes included under the term statistics, this definition at 
once is too wide, and also does not bring out the distinctive 
nature of statistical method. 

It is better, in fact, to define statistics a posteriori. In dealing
with masses of figures, large numbers descriptive of groups, series
of totals or averages relating to different dates or
places, it is found that special methods become
necessary — methods which depend on particular properties of 
large numbers, methods which are suitable for describing com- 
plex groups so that they can be easily comprehended, methods 
for analysing the accuracy of statements, for measuring the 
significance of differences, for comparing one estimate with 
another. Those estimates to which these methods apply are 
within the scope of statistics ; it is the study of these methods 
that is the object of this book. It is clear that, under our 
tentative definition, statistics is not merely a branch of political 
economy, nor is it confined to any one science. A
knowledge of statistics is like a knowledge of
foreign languages or of algebra: it may prove of
use at any time under any circumstances. 

It may be interesting to trace the connection of statistical 
method with various branches of knowledge. To begin 
with the physical sciences: there are two points
in which this method touches astronomy. The
method of least squares was introduced by an astronomer,
anxious to choose the best of several slightly discrepant observa- 
tions of the position of a star. In most physical observations 
several measurements are taken of the same quantity, and it is 
found that, however carefully they are made, they never absolutely 
agree; just as the averages obtained by different statisticians 
from the same series of sociological observations are generally 
not identical. From such a group of measurements it is neces- 
sary to deduce the most probable estimates ; this is done by the 
application of the law of error, known as the method of least 
squares. 
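For repeated measurements of a single quantity, the estimate that makes the sum of the squared deviations least is the arithmetic mean. The short Python sketch below, which is no part of the original lectures, checks this by trial over a grid of candidate values. The five readings, the grid, and the names `measurements` and `sum_of_squares` are invented for the illustration.

```python
# Illustrative only: the five readings below are invented, not taken from the text.
measurements = [10.2, 9.8, 10.1, 10.4, 9.9]


def sum_of_squares(estimate, values):
    """Sum of squared deviations of the values from a proposed estimate."""
    return sum((v - estimate) ** 2 for v in values)


mean = sum(measurements) / len(measurements)

# Try candidate estimates on a fine grid around the readings and keep the best.
candidates = [9.0 + 0.01 * k for k in range(201)]  # 9.00, 9.01, ..., 11.00
best = min(candidates, key=lambda c: sum_of_squares(c, measurements))

print(f"arithmetic mean        : {mean:.2f}")
print(f"least-squares candidate: {best:.2f}")
```

The grid search is used only to make the point visible; in practice the mean is computed directly.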

The other point of resemblance of statistical to astronomical
method is common also to geology and to most applied sciences.
The course of scientific measurement has generally
been to take first a rough observation of a quantity,
such as the distance of the sun, the thickness of a stratum, the 
atomic weight of an element, the specific gravity of a substance ; 
then, as information accumulated, as the precision of instruments 
increased and methods were better adapted, to make the measure- 
ment gradually more and more accurate. It is important 
to appreciate this development, for in the present state of our
knowledge, many statistical measurements cannot be made with 
precision for want of data, and a critic is inclined to say that for 
this reason preliminary estimates are valueless ; but from the 
scientific point of view this criticism is wrong, for a faulty 
measurement made on logical principles is better than none, 
and may lead to others with progressive improvement. 

Passing by the general resemblance of statistical investigations
to all scientific experiments, we may notice the use of statistics
in biology. It was, perhaps, not recognised before
the publication of Professor Karl Pearson's
investigations,* that the whole doctrine of evolution and heredity
rests in reality on a statistical basis. It is in this direction that 
the most important new work in statistics is being done. It may 
be worth while to sketch very briefly the nature of the problem. 
Out of a great number of observations, say the measurements of 
the heights of a group of men, the type is found — the average, 
about which all the measurements are grouped according to some 
definite law. The problem is then to determine whether this 
type or the grouping about it changes, and in what way. The 
differences found in successive generations form the data on 
which arguments as to evolution and development are founded. 
The method applies equally to fossil remains, to zoological 
species, and to many other groups. If it is neglected, many
valid arguments lose a great part of their force, and theories
are founded on personal impressions of phenomena instead
of on scientific measurement. The work done in this
direction becomes of immediate use to the student of social
questions. The average wage and the grouping about it
and the change in these quantities present precisely similar
problems; the change in the purchasing power of money is
calculated by the same mathematical formulae; in fact, these
methods furnish the only accurate way of measuring numerical
changes in complex groups. Much valuable information has
been collected in anthropometrical laboratories, which has
increased the statistician's knowledge of facts and given birth to
important theoretical principles.

* See The Grammar of Science, chap. x. seq., and the references there given.

Meteorology has much in common with statistics. The chief 
measurements taken for the purposes of this science are of 
temperature, barometrical pressure, moisture of the
air, and force of the wind. One of the problems
attacked is again that of finding the type from a group of
observations, and of measuring its change. The tables which 
state the average temperature year by year are in many ways 
similar to those which the Registrar-General publishes of births, 
deaths, and marriages. Without the aid of statistical method, 
the averages obtained show mere numbers from which no logical
deductions can be made. With the help of this knowledge, it 
can be seen whether the change from year to year is significant 
or accidental ; whether the figures show a progressive or periodic 
change; whether they obey any law or not. The problem is 
easily seen to be of importance for forecasting the future 
population and for many similar purposes. 

We are thus brought by a short step to the province to which 
statistics has sometimes been confined : the study of demography. 
If in demography we include, not merely the
measurement of the numbers of the population,
the birth, marriage, and death rates, the distribution by age, by 
sex, and by locality, in fact, the figures which naturally come 
from the census and the Registrar-General's returns ; but include 
also, industrial and social measurements, of distribution of the 
population by trade, of income, wages, production, foreign trade,
transportation, and so forth ; we have extended the limits of 
demography till it includes the majority of the statistical 
investigations directly interesting to students of sociology 
or of political economy. Without stopping to decide the 
exact limits of demography, we can quickly pass to another 
definition of statistics (so far as it concerns such students) on 
which it is wished to lay a certain stress : statistics is the science 
of the measurement of the social organism, regarded as a whole, in
all its manifestations. In a monograph, after the
fashion of Le Play, a single family is studied; the
occupations and earnings of its members, the way
these earnings are spent, and its economic position 
generally are set down ; but this study is not so far statistical. 
In demography we study the same quantities when groups of
families are concerned ; the number of families engaged in certain 
industries, and their average receipts, expenditure, and savings ; 
here we have statistics. In the monographic method the indi- 
vidual is everything ; in the statistical method, nothing. When 
we wish to obtain a measurement of the group, peculiarities 
of individuals receive no attention; it is only when the same
peculiarities are possessed by many persons that they become of
importance. Statistics may rightly be called the science of
averages. In the measurement of a complex group, say of
incomes and wages, the exceptional artiste who can earn £100
in an evening, and the inefficient labourer who can only make 
sixpence a day, affect only slightly the general average ; they 
are not entered in separate categories ; but the large group of 
skilled artisans who can earn over forty shillings a week, or of 
casual labourers who make less than fifteen shillings, are entitled 
to separate notice. The exact specification to be adopted is only 
a question of degree, which differs with the nature of the par- 
ticular investigation in hand. The object of a statistical estimate 
of a complex group is to present an outline, to enable the mind 
to comprehend with a single effort the significance of the whole. 
To do this it is necessary to exclude rigorously any presentation 
of details, for the same reason that, in a painter's rendering of a 
tree, the individual leaves are not distinguished. The outline 
will be a little blurred, a little inaccurate ; but it will be as 
distinct and detailed as the mind has power to grasp it, or the 
eye to see it ; the impression will be rightly given. There is a 
very important principle involved in this method. The individual 
members of a group vary continually, the whole group varies 
very slowly. It is impossible to follow or measure the motions 
of separate atoms ; it is comparatively easy to state the laws of 
motion for a solid body. Great numbers and the averages
resulting from them, such as we always obtain in measuring 
social phenomena, have great inertia. The total population, the 
total income, the birth and death rates, average wages, change 
very little ; similar quantities relating to a single family change 
very fast. It is this constancy of great numbers that makes 
statistical measurement possible. It is to great numbers that
statistical measurement chiefly applies.
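The inertia of great numbers can be seen in a small simulation. The Python sketch below is a modern illustration and not the author's method. It assumes, purely for the sake of the example, that each family's income in any year is drawn independently and uniformly between 50 and 150 units, and compares the year-to-year variation of one family with that of the average of ten thousand. The figures and the names `yearly_incomes`, `YEARS` and `FAMILIES` are hypothetical.

```python
import random
import statistics

random.seed(1901)  # fixed seed so the sketch is repeatable


def yearly_incomes(n_families):
    """Hypothetical incomes for one year: invented figures, uniform between 50 and 150."""
    return [random.uniform(50, 150) for _ in range(n_families)]


YEARS = 20
FAMILIES = 10_000

single_family = []   # income of one family, year by year
group_average = []   # average income of the whole group, year by year
for _ in range(YEARS):
    incomes = yearly_incomes(FAMILIES)
    single_family.append(incomes[0])
    group_average.append(statistics.mean(incomes))

spread_single = statistics.pstdev(single_family)
spread_group = statistics.pstdev(group_average)

print(f"year-to-year spread, one family : {spread_single:.2f}")
print(f"year-to-year spread, group mean : {spread_group:.2f}")
```

Under these assumptions the group average moves far less from year to year than the single family does, which is the constancy the paragraph describes.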

The relation of statistics to political economy is a simple
one. Professor Marshall says,* "Statistics are the straw out of
which I, like every other economist, have to make
the bricks." The statistician furnishes the political
economist with the facts by which he tests his
theories or on which he bases them. Since the economist deals
chiefly with phenomena relating to groups, and regards the 
individual only as a member of a group, it is to statistics as 
the science of averages that he looks for his information. When 
he is dealing with national economy, with the volume of trade, 
for instance, or the purchasing power of money, he is limited to 
pure theory, till statistics as the science of great numbers has
provided the facts. The chemist experimenting in his laboratory 
is like the statistician ; the chemist theorising in his study is like 
the economist. Because of this relation it may be held to be the 
business of the statistician to collect, arrange, and describe, like
a careful experimentist, but to draw no deductions: even in an
investigation relating to cause and effect to present evidence but
not conclusions. As a distinct operation, of course, the statistician
may assume the rôle of the economist, for the same man may well
be fitted to conduct the experiment and fit the theory. And just 
as a theoretical chemist will have little or no power unless he 
fully appreciates experimental methods and difficulties, even if 
he has not the manual dexterity to conduct them to perfection 
himself, so no student of political economy can pretend to com- 
plete equipment unless he is master of the methods of statistics, 
knows its difficulties, can see where accurate figures are possible, 
can criticise the statistical evidence, and has an almost instinctive 
perception of the reliance that he may place on the estimates 
given him. 

* Evidence to the Committee on the Census, 1890.

The proper function, indeed, of statistics is to enlarge individual
experience. An individual is limited to what he can
see, a very small part of one division of
the social organism; his knowledge is extended in
various ways, by the conversation of his acquaintance,
by newspaper reports, by the writings of experts. According
to his ability and power of judgment, he will be able to form
a correct view of the numerical importance of groups of persons 
and things ; but it is in the highest degree improbable that he 
will not have been biassed by the peculiarities of his position, 
and that he will place his different items of information in the 
right perspective ; and he will not be able to gauge rightly the 
accuracy of his data. As soon as he begins to examine these 
points he is undertaking a statistical investigation, and will very 
soon find himself involved in all the difficulties and problems 
from which a knowledge of statistical method alone can dis- 
entangle him. This is the obvious answer to those who deny 
the use of statistics. A statistical estimate may be good or bad, 
accurate or the reverse ; but in almost all cases it is likely to be 
more accurate than a casual observer's impression, and in the 
nature of things can only be disproved by statistical methods. 

A chief practical use of statistics is to show relative importance,
the very thing which an individual is likely to misjudge.
Statistics are almost always comparative. The absolute
magnitude of a quantity is of little meaning
to us till we have some similar quantity with which to compare
it. A statement of the number of paupers in the United Kingdom
is valueless unless we know the total population. A statement
of the number of gallons of water supplied per head to the
people of East London is of little meaning to us till we know 
the quantity supplied to other towns. The average wage, shown 
in the Wage Census, does not convey its full significance till we 
have similar computations for other countries or relating to other 
years. In the case of most statistical estimates, it will be found
that we need another for comparison before we can appreciate
the meaning of the first.
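The comparison of a count with its population is a matter of simple arithmetic, as the Python sketch below suggests. The two districts and every figure in it are invented for illustration and come from no official return. The function `rate_per_thousand` is likewise a name of convenience.

```python
# All figures are invented for illustration; they are not drawn from any return.
districts = {
    "District A": {"paupers": 12_000, "population": 400_000},
    "District B": {"paupers": 9_000, "population": 150_000},
}


def rate_per_thousand(count, population):
    """Express a count as a rate per 1,000 of the population."""
    return 1000 * count / population


rates = {name: rate_per_thousand(d["paupers"], d["population"])
         for name, d in districts.items()}

for name, rate in rates.items():
    print(f"{name}: {districts[name]['paupers']} paupers, {rate:.0f} per 1,000")
```

In this invented case the district with the smaller absolute number of paupers has the higher rate, which the bare counts alone would conceal.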

If the group of objects which we wish to measure is large, 
its enumeration will be beyond our unassisted efforts, or those of
any organisation at our command. Some investigations,
indeed, have been successfully conducted by
private organisations, for instance, those which resulted in Mr
Booth's Life and Labour of the People, and Leone Levi's Wages and
Earnings; but in general the measurement of a part of the social
body or industrial organism must be undertaken by the central or 
local governments, if it is to be successfully carried out. The fact 
that this is the case explains the heterogeneousness and imper- 
fection of the mass of statistics extant. A government naturally
collects numerical information only in relation to its own func- 
tions. Thus the administration must know the numbers of the 
population and the area of the country in gross and in detail for 
its own purposes. Large groups of figures come simply from the 
necessity of public account-keeping. Many official figures are 
bye-products; for office purposes an account is kept of all 
transactions in which the government has a hand, and of in- 
dustries subject to special regulations ; and the government 
publishes most of the figures which thus come in its way. To 
such causes are due our knowledge of the statistics of income, 
education, imports, railways, mines, factories, and so on. Some 
such publications are only survivals from a former time, when 
the figures were directly needed, such as Gazette wheat prices 
(used for the calculation of tithes), and, to some extent, statistics 
of exports. Though few figures are collected simply for scientific 
purposes, yet in many cases schedules issued for administrative 
ends are used at the same time for the reception of other 
information, of use chiefly to the sociological student ; much of 
the Census information comes under this heading. A view of 
those figures, relating to the United Kingdom, which are easily 
accessible to the student, can be obtained by turning through 
the annual Statistical Abstract for the United Kingdom, the
Annual Abstract of Labour Statistics, and the Registrar-General's
Annual Report ; in one or other of these, summaries of, and 
references to, most official statistics are to be found. 

It is clear that figures collected simply in connection with 
administrative purposes are not likely to be precisely those 
which are needed by the student of sociology or
political economy. Even where the wants of the
official and the student are nearly identical, the classification and 
tabulation may not meet scientific requirements. There has, 
indeed, been considerable progress in recent years, due one may 
suppose to the influence of Sir R. Giffen, in the direction of 
amassing statistical information not absolutely needed by the 
administration, and most of the work of the Labour Department 
is of this kind ; but very much more might reasonably be done, 
at an expense which would be almost negligible when considered
in relation to the national income. Thus the census might be 
made, in part at least, quinquennial, and the body of workers, 
who are organized once in ten years to conduct it, only to be 
disbanded when the report is issued, might be made permanent 
and entrusted with the organization of a decennial industrial 
census. Market prices of many staple commodities could be 
tabulated by local officials in the same way as wheat sales 
are now registered. Movements of goods by rail could be 
tabulated in the same way as transport by water, and the 
anomaly that we know more of our foreign than of our home 
trade be removed. The production of factories might be re- 
turned as well as that of mines. A permanent government 
office might well be charged from time to time with special 
investigations, similar perhaps to the Wage Census of the Board 
of Trade. It needs very little study of statistics or of political 
economy, to feel the pressing need of some of this information. 
Attention may be drawn to some of the gaps in our
knowledge. When dealing with our national income
we can obtain statistics of wages, and of income subject to tax ; 
but for salaries below the exemption limit, and for part of the 
income received for foreign investments, we are forced to rely 
on educated guesses. For the change of the purchasing power 
of money we know, thanks chiefly to the Economist and trade 
newspapers, the course of wholesale prices, but many interesting 
calculations are brought to a standstill because of the complete 
dearth of records of retail prices. With regard to wages, we can 
estimate fairly accurately standard and average wages, but, in 
default of an industrial census, do not know how many persons
are in receipt of each given wage, nor the relative numbers of 
masters and men. We know fairly well the mass of trade that 
leaves or reaches our shores, but as regards the far greater mass 
of our internal trade our ignorance is almost complete. Till 
there is a public demand for such information, it will need a very
enlightened government to spare the time, trouble, and expense 
necessary for a systematic attempt to fill up these gaps ; but 
we can all do something towards this enlightenment, and in 
furtherance of this demand, by studying what has been done 
in other countries, and building up a knowledge of the science 
of statistical investigation. 

The absence of such a demand is perhaps due to a widely 
spread and not unreasonable distrust of statistical estimates,
crystallized in the common remark that "anything
can be proved by statistics." This is to a great
extent the fault of the criticising public themselves : they are 
always requiring and the newspapers always supplying informa- 
tion, which depends on a statistical basis, but for which good 
statistics are not to be found for one or other of the reasons 
already indicated. The informant must perforce 
turn to inaccurate estimates, and the public has no 
knowledge or discrimination as to what estimates rest on satis- 
factory data, or indeed as to what quantities are capable of 
statistical evaluation. Again, figures which cover only part of 
the subject, such as the Wage Census average, or the Labour 
Gazette returns of unemployed, may be quoted as universal ; mere 
estimates, made for quite other purposes, may be given as 
accurate and complete ; and on such unreliable premises argu- 
ments are based, which naturally, by a judicious choice of 
material, can be made to support any theory at pleasure. It 
will generally be found that the statistician, on whose authority 
such statements are supposed to be based, is not to blame. 
Some of the common ways of producing a false statistical 
argument are to quote figures without their context, omitting 
the cautions as to their incompleteness, or to apply them to a 
group of phenomena quite different to that to which they in 
reality relate; to take estimates referring to only part of a 
group as complete ; to enumerate the events favourable to an 
argument, omitting the other side ; and to argue hastily from 
effect to cause, this last error being the one most often fathered 
on to statistics. For all these elementary mistakes in logic, 
statistics is held responsible. 

Perhaps statisticians themselves have not always fully recog- 
nised the limitations of their work. At best they can measure 
only the numerical aspect of a phenomenon; while
very often they must be content with measuring
not the facts they wish, but some allied quantity. We wish to 
know, for instance, the extent of poverty, its increase or diminu- 
tion : poverty we cannot define or measure, and we cannot even 
count the number of the poor ; all we can do is to state the 
number of officially recognised paupers, and add perhaps some 
estimates from private sources ; but this gives us no clue to the 
intensity of poverty in individual cases. Or we wish to obtain 
statistics of health : all we can measure is the death-rate and
average length of life, very different matters. The statistician's 
contribution to a sociological problem is only one of objective 
measurement, and this is frequently among the less important 
of the data; it is as necessary, however, to its solution as 
accurate measurements are for the construction of a building. 
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CHAPTER II. 

THE GENERAL METHOD OF STATISTICAL 
INVESTIGATION. 

At first sight it will seem as if there were no method common 
to all statistical investigations, and indeed the processes differ 
so widely that it is not easy to outline a scheme which will 
include them all ; but the following sequence is generally 
indicated* as of general application, and will serve at least 
to thread an examination of methods together: (1) the Collection
of Material, (2) its Tabulation, (3) the Summary, and (4) a 
Critical Examination of its results. These processes will be 
discussed in detail in the following chapters. 

It may be well to state what equipment is necessary for the 
student who wishes to learn statistical methods. In collection 
and tabulation common-sense is the chief requisite,
and experience the chief teacher; no more than
a knowledge of the simplest arithmetic is necessary
for the actual processes; but since, as we shall
see immediately, all the parts of an investigation are
interdependent, it is expedient to understand the whole before
attempting to carry out a part. For summarising, it is well to 
have acquaintance with the various algebraic averages, and 
with enough geometry for the interpretation of simple curves, 
though all the operations can be performed without the use of 
algebraic symbols. For criticism of estimates and interpre- 
tation of results, it is necessary to use the formulae of more 
advanced mathematics, and it is obviously expedient to under- 
stand the methods by which these formulae are obtained to 
ensure their intelligent use. They are specially necessary for 
the comparison of complex groups, and for estimating the 
significance of a divergence from the average, or the deviations 
in a list of periodic figures. 

* See, e.g., Dr Bertillon's Cours élémentaire de statistique, to which the
present author is indebted for some of the treatment in the following pages.

(1.) Information is generally collected by issuing blank circulars,
forms of inquiry, to be filled in either by a few officials or by many
individuals, and the proper drawing up of this
form is one of the chief tasks in a good investigation.
Before this form is issued it is necessary to formulate
a complete scheme of the whole undertaking, and even to have 
some idea of what the resulting figures will be, so as to be 
able to arrange the details of the organization on the right 
scale, and adjust the tools used to their purpose. As already 
pointed out, the object whose measurement is wanted is not in 
general exactly that which can be measured, and the measurable
quantity nearest to it must be found; e.g., when the average
annual earnings of the working class were in question, the 
quantity first measured was the average weekly wage. Then 
some technical knowledge of the particular subject is needed ; 
and, if not possessed, a preliminary inquiry on a small 
scale may be necessary to show how to fit means to ends. 
The people who possess the information required must be 
discovered and interrogated at first hand. The questions put 
must be those which will yield answers in a form ready for
tabulation, and the scheme of tabulation must
therefore be thought out beforehand. The questions
must be so clear that a misunderstanding is impossible,
and so framed that the answers will be perfectly definite,
a simple number, or "yes" or "no." They must be such as
cannot give offence, or appear inquisitorial, or lead to partisan 
answers, or suppression of part of the facts. The mean must 
be found between asking more than will be readily answered 
and less than is wanted for the purpose in hand. The form 
must contain necessary instructions, making mistakes difficult, 
but must not be too complex. The exact degree of accuracy 
required, whether the answers are to be correct to shillings or 
pence, to months or days, must be decided. Every word and 
every square inch of space must be keenly criticised. A 
little trouble spent upon the form will save much inconvenience 
afterwards. 

(2.) In considering what method is to be adopted for tabula- 
tion, we must remember that the investigation is intended to 
furnish the answers to certain definite questions (how many
people, what wage, what price), and each
column must present some total which is relevant to these
questions. The exact scheme employed will differ in different
inquiries. In the population census, the tabulation is almost 
automatic ; in the wage census, the best and simplest way to 
show the grouping about the average wage in each occupation 
had to be specially devised ; in trade statistics the number of 
different categories to be adopted and the limits of each raise 
difficult questions. In general, the scheme of investigation requires
knowledge of certain groups; and the totals resulting
from tabulation should show the numbers of items in these, so
that after tabulation, instead of the chaotic mass of infinitely 
varying items, we have a definite general outline of the whole 
group in question. 

(3.) When the raw material is worked up to this point, skill of 
a different kind is wanted. From the numbers obtained, we 
have to pick out the significant figures; so to
present the totals and averages as to give a
true impression to an inquirer; to summarise briefly the
information obtained; to concentrate the mass into a few
significant averages, and to describe their exact meaning in
the fewest and clearest words, for it is the result of this
concentration which will generally be used and quoted. To 
do this skilfully requires an acquaintance with the method of 
averages and the use of diagrams. It may further be necessary 
to fill in unavoidable gaps in the figures in order to supply esti- 
mates for intermediate years; this needs a study of the dangerous 
method of interpolation. Finally, the verbal description of the 
process, its genesis and results, and an estimate of its accuracy 
must be added, and then the investigation is complete. 

(4.) The student who has to make use of statistics should not 
be content to take the results of an inquiry on authority, but
ought to acquaint himself with all these details of
method. Before the results can be criticised, it
is necessary to know the complete genesis of the figures ; 
whether the whole field was covered ; exactly whence the 
information tabulated was obtained ; whether there was a 
possibility of bias; how nearly the individual answers were 
correct; whether the informants really knew the facts they 
related, and if they were likely to state them correctly. The 
published statement of the results should show clearly the 
whole scheme of collection so as to make this criticism possible ; 
in particular, specimens of the original blank forms should be
included, so that the reader can judge whether the original
answers correspond exactly to the form of tabulation employed. 
Internal evidence often leads to much useful criticism. It can 
be seen whether the number of returns for each group is 
proportional to its importance, or if a specially important 
figure depends on only slight evidence. The continuity of the 
figures can be examined, and the causes of sudden gaps in- 
vestigated. The returns can be divided into sample groups, 
and the extent of the correspondence of these groups to the
general result will often indicate whether the returns are 
sufficiently general. A careful study of the more minute 
tabulations may show within what percentage the final numbers 
may be expected to be correct. 
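
The test by sample groups described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_group_check`, the choice of the mean as the "general result," and the wage figures are all invented here and are not taken from any actual inquiry.

```python
from statistics import mean

def sample_group_check(returns, groups=4):
    """Deal the returns into `groups` interleaved sample groups and
    compare each group's mean with the mean of all the returns."""
    overall = mean(returns)
    report = []
    for g in range(groups):
        sample = returns[g::groups]      # every groups-th return, starting at g
        m = mean(sample)
        # percentage divergence of this sample group from the general result
        report.append((g, m, 100.0 * (m - overall) / overall))
    return overall, report

# Hypothetical weekly wages in shillings, invented for illustration only.
wages = [18, 22, 25, 21, 30, 19, 24, 26, 20, 23, 27, 21]
overall, report = sample_group_check(wages, groups=3)
for g, m, pct in report:
    print(f"group {g}: mean {m:.2f}, divergence {pct:+.1f} per cent.")
```

If the sample groups diverge widely from the general result, as they may well do with so few returns, the returns are probably not yet sufficiently general to support the total.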

The most important function of statistics is to produce
evidence showing the relation of one group of phenomena to
another; for the information obtained is presumably intended
as a guide for action, the guidance is generally needed to show 
what actions are likely to produce certain desired effects, and 
this is best investigated by finding how such effects have been 
produced in the past. We have then to determine whether 
changes in one measurable quantity (e.g., the duties on corn)
have produced changes in another (e.g., the amount of pauperism);
a problem generally insoluble, but one on which most light 
can be obtained by the study of the relevant statistics in the 
light of mathematics, the mathematics of probability, and it is 
in this particular branch of mathematics that recent statistical 
progress has been chiefly made. 

Such questions, however important, are somewhat abstruse, 
and presuppose a certain amount of technical knowledge which 
is not in the possession of the general student. The plan of 
this book is to postpone all questions requiring such technical 
or mathematical knowledge to the Second Part, and to confine 
our earlier discussions to problems needing no special training 
or equipment.

CHAPTER III.

ILLUSTRATIONS OF METHOD.

Section 1. The Population Census.

The population census will provide good illustrations of the 
principles laid down in the last chapter, both because we shall 
be at first on familiar ground, since every one knows 
its scheme, purpose, and details, and because the 
form of inquiry used for the collection of the original data 
brings out very prominently the difficulties met with in detailed 
statistical investigations. 

The first thing to be considered is the exact object for 
which the census is undertaken. It is for demographical pur- 
poses; to supply information as to the numbers and 
local distribution of the population, the numbers of 
each sex and age, their so-called civil condition (i.e., whether
single, married, or widowed), and their nationality. This is the 
minimum information necessary for administrative purposes. 
In addition to these facts there are very many others which the 
statesman and the economist wish to know about each member 
of the population, and the census form is the only means in 
England of collecting universal data ; the question as to which 
of these shall be investigated and which neglected, is decided
more by expediency than on principle. Of these
desiderata the following may be mentioned: the
size and structure of the family, its position in the social scale, 
the economic position of its head ; the nature of employment 
of its members, the wage or income of each member and of the 
family as a whole, the rent and size of their house, their educational
condition, the ages at which they commenced or retired
from work, their migrations, their combination in religious or
other bodies, and their infirmities. It is clear that some of
this information must be dispensed with, if the form is not to
be overcrowded, and if the tabulation is to be finished in any
reasonable time; and an examination of the general nature
of the questions which can suitably be put will show how the
necessary selection is made. 

First, the questions must be those which the informant is 
able to answer. Now, if the questions were only to be put
to educated and methodical persons, doubtless a
full account could be given of the family migrations
and of the ages at which each member had been at work;
but the peculiarity of the census is that it is universal, and 
the questions must be such that the least educated and most 
unthrifty householder shall be able to answer ; in many cases 
such facts would have been unrecorded and forgotten. 

Secondly, the questions must be perfectly definite, so that 
there can be no doubt as to what the right answer should be. 
The only answers which are of value to the
statistician are "yes," "no," or a simple number.
Adjectives and adverbs such as many, often, partly, &c., bear 
different numerical meanings to different people, and, though 
they may express fairly clearly the position of an individual, 
are nearly useless for tabulation,* which is their only purpose 
so far as the census is concerned. Thus the question as to 
education would have to be, not " state whether well, moderately, 
or badly educated," but " state at what age school was left," or 
"how many years at school?" But even if such questions 
were not excluded by our first test, by the forgetfulness of the 
informant, the statements given would be of little practical value, 
and very often incorrect. An inquiry as to wage and income 
could not be made sufficiently definite without so many questions 
as to require a form to itself ; for wages, as we shall see when 
considering the Wage Census, require very careful definition, and 
many subsidiary questions must be put to get a proper estimate ; 
the simple query, "what is your weekly wage or annual income?" 
would be answered on so many varying principles that the result 
would be valueless. 

* But see p. 138, infra.

Thirdly, the questions must be such as will be answered
truthfully and without bias. There is hardly a demand on
the census form which would not be excluded, if
this rule was too rigorously enforced, as we shall
see immediately. The worst offender in this respect is the
question, Employer or employed? For though there are many
cases in which a man is both employer and employed so that 
this question should be excluded by our second test, many 
persons consciously exaggerate their social importance by 
erroneously replying the former. Questions relating to social
position must generally be excluded by this rule.

Fourthly, the questions must be those which will be answered
willingly, and must therefore not be inquisitorial, or such as
to raise apprehension of a change of law or an
imposition of taxes. Questions as to membership
of trade unions, or of friendly societies, or as to insurance, 
would be thought inquisitorial. Many would refuse to state 
their incomes, holding it to be no one's concern but their own. 
Questions as to rent might be regarded as possibly leading 
to taxation. Questions as to religion are badly answered, as 
was shown in the evidence before the Census Committee of 
1890,* and should be excluded by each of these four rules. 
Some persons do not know what their religion should be 
named, others would find the question indefinite, others would 
deliberately answer wrongly, and many not at all. 

The questions on the census form† not excluded on one
or other of these grounds are Nos. 1, 2, 3, 4, 5, and 10;
these are fairly definite, and householders are generally able 
and willing to give correct answers to them. Questions 6, 7, 
8, 9, and 11 compete with many others, which lead to equal 
inaccuracies, for a space on the census sheet. No. 6 has long 
held its place because of its great importance; Nos. 7, 8,
and 9 are on their trial. A further discussion of the merits
of some of these is to be found in the Reports of the Com- 
mittee already mentioned ; here it is only intended to indicate 
the general grounds of inclusion or exclusion. 

* Report of Committee on the Census, 1890 (C. 6071). † Facing p. 23.

So far we have not discussed the important question as to
who should fill in the form. If, as in the English Census,
it is to be filled in by the householder, the questions
must be much simpler in matter and words
than if it is to be filled in by an official teller. In the latter
case the form may be much more complicated, the questions
more inquisitorial and such as might lead to indefinite answers
on the part of ignorant people; for the teller would insist on
an answer, be able to exclude those obviously wrong, and
cross-question till the indefinite answers were so altered as to 
allow definite tabulation. In a great and complex undertaking 
like the Census, where many tellers must be impressed for a 
single day's work, their instructions and the general plan must 
be sufficiently simple ; but as the extent of an inquiry con- 
tracts, the tellers can receive more complete instructions, and 
the information requisitioned may be more complex. This is 
of most importance in connection with columns 6-9. 

The general shape and appearance of the sheet needs 
attention. If the structure of the family is to be shown, the 
answers are best given on a single sheet, which
must contain enough lines for the largest ordinary
household, so that the trouble of fastening together of many 
couples may be avoided, and tabulation not be hindered. The 
spaces must contain plenty of room for answers in uneducated 
handwriting, without making the whole so large as not to lie 
easily on a desk. The instructions must be distinct and visible, 
and placed in close connection with the answers ; to further 
this, a skilful use may be made of capitals, italics, and different 
founts of type. On the form facing p. 23, those in use are 
roughly reproduced in miniature. 

The form should always show for what purpose the figures 
are collected, and how they will be used, in order to enlist the 
support of the informant and allay misapprehension.
The extent to which this should be done depends
a good deal on whether the filling-up is compulsory, as in 
the population census, or voluntary, as in the wage census. In 
the case before us no preamble is necessary, since every one 
knows the main features of a census, and most are willing to 
further its objects ; but it must be shown that the inquiry is 
sanctioned by Parliament, and that compliance is compulsory. 
This is done on the back, on the fold which is outside before the
form is opened ; and even though penalties are threatened 
against absence of or falsification of returns, the last sen- 
tence describes the object of the inquiry and guarantees the 
informant against malicious use of his answers. Where in- 
formation is voluntary, a careful letter should be printed and 
circulated with the form, persuading the informant to give his 
assistance. 

While the main part of the form is filled in by the householder,
other parts are filled in by the officials, and with very
little trouble a good deal of subsidiary information
can be collected in this way. On the outside the
Parish, Town, Sanitary District, Street, and Number are endorsed, 
so that the answers can be tabulated for any of these districts. 
The teller could also, as he took the form, enter the number of 
stories to a house, which is not done in the English Census, and 
other information as to the style of house and street might be 
endorsed. In a more intensive investigation, Mr Charles Booth's 
assistants, for instance, could be trusted to come out of a house 
with an accurate knowledge of many interesting details. 

We can now proceed to the individual criticism of the form
in the light of the rules suggested above. In the first place,
even the arrangement of columns is not perfect. To
labourers who are not in the habit of writing at all,
and who have (to judge from election posters) to be instructed 
how to put their mark in the right place on a ballot paper 
(many papers being destroyed simply through ignorance), this 
arrangement of horizontal and vertical columns would be con- 
fusing, and without help they would not gather at all what they 
were to do. They would fill up more easily a paper in which 
the answers were to follow the questions immediately : — 

    State your Name ..........
    State your Age ..........
    State your Sex ..........
    Unmarried, Married, or Widowed ..........

and so on.

This form, however, could only be used if a separate paper were 
to be filled in for every individual, children and all. Other 
elementary matters might be improved. On looking through the 
form a great number of words and phrases will be found which 
are not in common use, e.g., abode, dwelling (as a noun), elsewhere,
East Indies, imbecile, "precise" infirmity, general term,
column, the foregoing, condition as to marriage. In column 1
the phrase, "name and surname" reads as though surname were
not a name, and perhaps the word "surname" is not in general
use, so that the printed word might be taken to mean title, and
the confusing answer "none" written under it. Does the instruction
"write after" mean to the right, or below?

The first question, which for the general purpose of the
census should be the most definite of all, leaves some room for
doubt. What of a night-watchman returning at
4 A.M., or a printer at 2 A.M.? What constitutes a
traveller: does a man who leaves the house before midnight, or
a man who goes down to Brighton by the theatre
train come under the term? Is midnight or 2 A.M.
the critical time? What of a person who dies at 1 A.M., or a
birth at midnight? How is the householder to know whether
any of his establishment are returned elsewhere? Since too 
many instructions only lead to confusion, the tellers should be 
specially taught the answers to such questions. 

The very meaning of the phrase "population of a district"
is open to much doubt. In France "la population de fait,"
which consists of all present in the given district
at the given moment, is distinguished from "la
population de droit," which consists of all usually resident in the
district, including those temporarily absent, and excluding those
only momentarily present, and from "la population municipale,"
which is "la population de droit," less prisoners, hospital patients, 
scholars resident in schools, members of convents, the army, and 
so on.* The English Census counts "la population de fait." 
In the United States we find a "constitutional population,"
which excludes residents in Indian Reservations, the Territories,
and the District of Columbia; the "general population,"
which includes in addition the Territories (except the Indian
Reservation, Indian Territory, and Alaska); and the "total population,"
which includes all excluded in the former.† In the
future questions will arise as to the inclusion of the Philippines 
and Cuba. Notice that the Channel Islands and the Isle of 
Man are included in the English Census. 

* See Bertillon, ibid., p. 146.

† Willcox: Area and Population of the United States at the XI.
Census, a book which gives a very useful criticism of the accuracy of the
most elementary data of statistics. It is a pity that space is wasted in a
useless attempt to supplant the word "statistician," which has now a definite
meaning, by the word "statist," which has another equally definite meaning.
Does Dr Willcox wish to substitute "statics" for "statistics"?

It is possible to find difficulties in filling up all the columns
except No. 4. For illustration, consider how column 2 should 
be filled in in the case of a cousin who was a " paying guest," or 
a relation who was a visitor ; for column 3, is a divorced person 
single or a widower, and what of a woman who is doubtful 
whether her husband is lost at sea? Errors come from No. 3 
because many unmarried people call themselves married. 

It is well known that column 5 is wrongly filled in for two 
reasons — one, that elderly people often do not know their ages 
accurately and enter them to the nearest round 
number, so that the returns congregate at 40, 50, 
60: the error thus arising is eliminated by tabulation in the 
groups 35-45, 45-55 years, &c., and for more minute tabulation
the groups 3-7, 8-12, 13-17, &c., are suggested : the other is that 
many ladies habitually enter their ages too low ; in this case 
also the Registrar-General is able to deduce nearly correct 
totals. 
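
The grouping just described, in which each round number stands at the middle of its group, can be illustrated by a short Python sketch. The function names and the ages are hypothetical, and it is an assumption made here that "35-45" means from 35 up to but not including 45, while "3-7" includes both ends.

```python
from collections import Counter

def decade_group(age):
    """Ten-year group centred on a round number: 35-45, 45-55, ...
    (lower bound inclusive, upper bound exclusive; an assumption)."""
    low = (age + 5) // 10 * 10 - 5
    return (low, low + 10)

def five_year_group(age):
    """Five-year group centred on a multiple of five: 3-7, 8-12, 13-17, ...
    (both bounds inclusive). Ages under 3 fall in a group beginning below 0."""
    low = (age + 2) // 5 * 5 - 2
    return (low, low + 4)

# Hypothetical true ages, and the same ages as they might be returned
# when each is rounded to the nearest multiple of ten.
true_ages = [38, 41, 43, 47, 52, 49, 58, 61, 63]
stated_ages = [round(a, -1) for a in true_ages]

# The heaping at 40, 50, 60 disappears once the ages are grouped.
true_table = Counter(decade_group(a) for a in true_ages)
stated_table = Counter(decade_group(a) for a in stated_ages)
print(true_table == stated_table)
```

Because the round number lies at the centre of each group, a return rounded to it stays in the group to which the true age belongs, so long as the rounding is only to the nearest round number.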

It is to be noticed that, since the ages stated are those
"last birthday," the age will on the average be given six
months too low, and, in fact, the ages given as 17, e.g., should
be scattered nearly uniformly over the months to the eighteenth 
year. 
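
The six months' understatement can be checked by a small calculation, sketched here in Python. The function name is hypothetical, and the sketch assumes that the time elapsed since the last birthday is spread uniformly over the year, each person being taken at the middle of his part of the year.

```python
from fractions import Fraction

def mean_understatement(parts_per_year):
    """Mean of (exact age - age last birthday), in years, when the time since
    the last birthday is spread uniformly over `parts_per_year` equal parts
    and each person is taken at the middle of his part."""
    n = parts_per_year
    offsets = [Fraction(2 * k + 1, 2 * n) for k in range(n)]   # midpoints
    return sum(offsets) / n

# Whether the year is divided into months or days, the mean is half a year,
# so that persons returned as 17 are on the average about 17 1/2.
print(mean_understatement(12), mean_understatement(365))
```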

The most important criticisms of the census-schedule are to 
be made on columns 6-9. It will not be expedient here to go 
into all the questions raised before the Committee 
on the Census as regards an industrial census. 
While there can be little doubt that a thorough census of occu- 
pations would be best undertaken separately, and on somewhat 
different principles from the population census, it is certainly 
better, till opinion is ripe for so radical a change, to include 
in the present census the best questions we can as to occupa- 
tions, than to omit them altogether in despair of accurate 
results. 

The objects aimed at, which we must always keep in mind 
when criticising special questions, are two : to find the number 
employed in each trade and industry, that is, so to say, to 
form vertical divisions; and to find the number in each rank 
or grade of employment (labourer, artisan, employer, &c.) in 
horizontal divisions ; so that the tabulation may give some such 
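result as:

                      Textile Industries.

                 | Cotton. | Wool. | Linen. | Totals.
    -------------+---------+-------+--------+--------
    Employers    |         |       |        |
    Managers     |         |       |        |
    Overlookers  |         |       |        |
    Spinners     |         |       |        |
    Weavers      |         |       |        |
    Labourers    |         |       |        |
    Children     |         |       |        |
    -------------+---------+-------+--------+--------
    Totals       |         |       |        |
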
The necessary minimum of information would be given by 
such answers as 

    Legal - Solicitor - Managing clerk.
    Mining - Coal - Hewer.
    Metal-worker - Iron - Smith's striker.
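
How three-part answers of this kind lead to the vertical and horizontal divisions of the table may be sketched in Python as follows. The function `cross_tabulate` and the sample answers are hypothetical, written only to illustrate the idea; they do not represent the method of the Census Office.

```python
from collections import defaultdict

def cross_tabulate(answers, industry):
    """Count answers of the form (industry, branch, grade) for one industry
    into a table of grade (rows) by branch (columns), with marginal totals."""
    table = defaultdict(lambda: defaultdict(int))
    for ind, branch, grade in answers:
        if ind != industry:
            continue                      # belongs to another industry's table
        table[grade][branch] += 1         # the cell
        table[grade]["Totals"] += 1       # horizontal division: total in grade
        table["Totals"][branch] += 1      # vertical division: total in branch
        table["Totals"]["Totals"] += 1    # grand total
    return table

# Hypothetical answers, invented for illustration only.
answers = [
    ("Textile", "Cotton", "Spinner"),
    ("Textile", "Cotton", "Weaver"),
    ("Textile", "Wool", "Weaver"),
    ("Textile", "Linen", "Labourer"),
    ("Textile", "Cotton", "Spinner"),
    ("Mining", "Coal", "Hewer"),
]
textile = cross_tabulate(answers, "Textile")
print(textile["Totals"]["Cotton"], textile["Weaver"]["Totals"])
```

An answer which gives only part of the description, such as "miner" alone, cannot be placed in any cell of such a table, which is the difficulty in question.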

Now the simple instruction, " State your occupation," would of 
course not lead to information of this sort. The coal-hewer 
would simply say miner ; the clerk, managing clerk ; the striker, 
very likely smith. To explain what is wanted and avoid mis- 
takes, the question is not put on the face of the form at all, but 
the informant is referred to the back, half of which is devoted to 
instructions relating to this column. These are lucid, carefully 
picked out with capitals and italics, comprehensive, brief and to 
the point. No one who wishes to fill in the form rightly, and is 
sufficiently educated to understand simple instructions, can easily 
go wrong. Yet, as a matter of fact, these instructions are in very 
many cases neither read nor followed ; and this fact is very im- 
portant in connection with the general study of blank forms of 
inquiry. Forms issued to people uninterested in the object in view 
will generally be filled in with the least possible expenditure of 
time and intelligence. Hence two courses are open: to reduce
the question to the simplest possible form, and make the best of
the result; or not to allow the informants to write in their own
answers, but to take them vivâ voce by means of a teller, who
has mastered the instructions, and has the necessary legal force 
behind him to compel information. The latter course entails 
time and expense. 

The result of the present system of inquiry, combined with 
a faulty method of tabulation, which it to some extent makes 
necessary, is that we have no reliable census of occupations for 
the United Kingdom. The present figures break down both 
from faulty data and from insufficient tabulation directly we 
attempt to make any calculations depending on them. 

An attempt has been made to correct to some extent our 
ignorance of the relative numbers of unskilled and skilled 
labourers, employers and employed, by columns
7, 8, and 9. The headings are not a model of
clearness; there is not the ordinary imperative "state" or
"write," nor is one told on the front of the form whether to
write Yes or No or to make a mark in the appropriate column, 
nor is the distinction between the three headings a perfectly 
definite one ; but still one is hardly prepared for the following 
statement in the report:*

"In numerous instances, no cross at all was made; in many
others, crosses were made in two or even all three columns, and, 
even when only one cross was made, there were often very 
strong reasons for believing that it has been made in the wrong 
column. Oftentimes this use of the wrong column can scarcely 
have been other than intentional ; being dictated by the foolish 
but very common desire of persons to magnify the importance 
of their occupational condition. This desire must have led 
many subordinates to return themselves as employers rather 
than as employed, for it is only on this supposition that we can 
account for the otherwise unintelligible fact that, under several 
headings, there are actually, according to the returns, more 
employers than employed, more masters than men. . . . We 
hold [these returns] to be excessively untrustworthy, and shall 
make no use whatsoever of them in our remarks." 

This attempt and its result are of the greatest importance to 
all who try to draw up forms of inquiry. 

* General Report on the Census of 1891, p. 36 (C. 7222 of 1893).

Before leaving the subject, it should be mentioned in passing
that we cannot deduce directly from our census the number of 
persons dependent on a particular trade for their living ; that is 
to say, the number of employers, their families (not otherwise 
returned) and domestic servants, and the number of employés
and their dependent families. This, the most important total 
for estimating the relative importance of different trades of the 
country, is not tabulated, though such tabulation has been found 
possible in other countries, and we are dependent on the esti- 
mates of statisticians for such totals.* 

To see how the information given by the answers on the 
census schedule can be worked up into detailed specific numbers, 
it is only necessary to look at the diagram and table prefixed 
to each of the sections relating to special trades in Mr Booth's 
Life and Labour of the People (e.g., vol. v., p. 46).†

* See Booth in Statistical Journal, vol. xlix. † See p. 78, infra.

Section 2. The Wage Census.

The main differences between the wage census, taken in 
1886, and the general population census are: (1) that the
filling up of the forms in the wage census was voluntary;
(2) that their correct filling up required a higher degree of
intelligence and education. As before, we must consider first
the object which the wage census was intended
to fulfil: it was to describe the earnings of the
people of the United Kingdom, to compare the rates of wages
trade by trade, and to find the relative numbers earning
at each rate. What is the best quantity to measure with this
object in view? As a preliminary question should we take the
day, week, or year as the unit of time? Clearly we
shall not be able to compute weekly wages if we
only obtain daily, for the week's work varies from four to seven 
days in different occupations. The week's wage is a more 
definite quantity ; but the simple comparison of weekly wages 
in different trades will be deceptive, because most trades are 
busier at one season of the year than at another, and in many 
the difference between season and season is very great ; in any 
particular week, then, we may be comparing the best season of 
one industry with the worst of another. To avoid this error, 
and because we do not know how many full weeks' wages are 
obtained in a year, except in a few non-intermittent trades, it
would seem best to take the year as unit; but the direct
calculation of an individual's annual earnings is practically
impossible. The employer is not acquainted with this sum, for in
large establishments the hands are continually changing, and 
one man will be paid by two or more masters in the same 
year ; and even in a factory with a nearly constant personnel, 
the weekly amounts paid to individuals are not in general so 
tabulated as to be easily summed, and the working out of the 
totals would require a prohibitive amount of clerical labour. If 
we turn to the workman, on the other hand, we shall find in the 
majority of cases that no accurate account has been kept of 
earnings through the year, and it would only be by careful 
individual examination, impracticable on any large scale, that
an estimate could be made; in many cases the men, even if
willing, would be quite unable to give a connected account of 
their earnings during the past twelve months. 

It seems clear that we must adopt a smaller unit, and since 
most wages are paid weekly, a week is the most natural one. 
The subsidiary questions which will lead best to an estimate of 
annual earnings will be discussed below. The answer to the 
former question, as to the best quantity to investigate, is in- 
direct ; the only individual measurements we can obtain directly 
are the week's wages, but these may be supplemented by esti- 
mates en masse. 

Next, who possess the information we require? Clearly
both employers and employed, and in an ideal census the
answers would be obtained from both groups; but
considerations of simplicity, cheapness, and accuracy are all
in favour of applying to employers alone.

If employés were to be interrogated the procedure would be
as follows. Draw up a form on the analogy of the census form, 
describe very briefly the purpose of inquiry, add a short series 
of concise, lucid, simple questions in suitable type and with 
careful spacing, such as will lead to the minimum information 
required ; let these forms be left to be called for, and when 
collected, let the tellers have time and opportunity to examine 
and correct them. It is clear that this method would entail an 
even more expensive organization than the population census, 
and as the result of experiment it may be doubted whether the 
maximum of accurate information that could be thus obtained 
would come up to the minimum that would be of use. A partial 
inquiry could, however, be carried out by means of trade 
unions if they were willing to give serious assistance. 

The method of inquiry among employers was as follows : — 
Suitable blank forms and an explanatory letter were sent by post 
to all employers, whose addresses could be found in the industries 
selected for investigation, and the answers were returned to the 
central office by post. This is far simpler and cheaper than the 
suggested scheme for inquiry among workmen, requiring far fewer 
forms and only a small staff of clerks. With business men it is a 
simpler matter to post the return when completed than to keep it 
for collection by hand. Since there is no personal intercourse over 
the matter it is especially necessary that the questions should be 
lucid, for the additional correspondence necessary to rectify
errors is a source of worry at both ends. A copy of one of these 
forms, abridged only in the number of subdivisions, is subjoined 
here and on the following page. 



WAGE CENSUS.
Return of the Rates of Wages Paid in Silk Manufactures.

Name of Factory or Firm ........

Address ........

Note.—It is requested that the salaries of clerks and managers may be excluded.
The return is of wages of working men only.

Numbers employed on ........ 1886 - - No. ........

Amount paid in Wages in the year 1885 - - £ ........

Highest weekly amount paid in 1885 £ ........ Date ........

Number of Hands paid in that week - - No. ........

Lowest weekly amount paid in 1885 £ ........ Date ........

Number of Hands paid in that week - - No. ........

State the present average rate of pay for overtime: that is, whether
overtime is reckoned as time and a quarter or time and
a half, &c., or in what way reckoned ........

State whether overtime is at present being worked, and how much;
or whether less than full time, and how much less ........
Current Rates of Wages and Hours of Labour per
Week of Persons employed in each Branch of the Silk
Manufactures, on ........ 1886.

Current Rates of Wages Paid and Number of Hours of Labour per Week
when in full work, but exclusive of Overtime.

Note.—State the Number of Hours of Labour per Week, whether the
Workers were paid by Time or Piece-work, and if paid by Piece-work
give the amount earned in a week, exclusive of Overtime.

N.B.—It is requested that this list of occupations may be revised
where necessary.

Description of                |       MALES.        |             FEMALES.
Occupation.                   | Men. | Lads & Boys. | Women.           | Girls.
                              |      |              | 18 years and     | Under 18 years.
                              |      |              | upwards.         |

Silk Throwing—
  Winders - - - - - - { Time
                      { Piece
  Cleaners  - - - - - { Time
                      { Piece
  Spinners  - - - - - { Time
                      { Piece
  Doublers  - - - - - { Time
                      { Piece
  &c.

Silk Spinning—
  Openers and Sorters { Time
                      { Piece
  Boilers - - - - - - { Time
                      { Piece
  Dressers  - - - - - { Time
                      { Piece
  Preparers and       { Time
    Carders - - - - - { Piece
  &c.

Silk Weaving—
  Winders - - - - - - { Time
                      { Piece
  Warpers - - - - - - { Time
                      { Piece
  Warp Pickers        { Time
    or Clearers - - - { Piece
  Doublers  - - - - - { Time
                      { Piece
  Fillers - - - - - - { Time
                      { Piece
The measurement of the annual earnings of groups of
workpeople was the ultimate object of the inquiry. Annual
earnings are composed of many different items,
of which the following are the most important:
ordinary weekly wages, pay for overtime, special payment
for special work (e.g., of builders if sent to a distance), or at
special seasons (such as the harvest); and payments not in
cash, such as free or reduced house-rent, free or cheap coal, 
and special goods at cheap or wholesale prices (such as cloth in 
textile factories, or potatoes for agricultural labourers). 

When payment in kind is at all general or important, it is 
generally better to proceed on a different method entirely, e.g.,
that followed by the Agricultural Sub-Commissioners of the 
Labour Commission. When it consists of only one simple item, 
such as a house rent-free, it can form the subject of an additional 
question on a form similar to that on p. 35. In the silk industry 
this does not occur ; but this discussion shows the necessity of 
preliminary knowledge on the part of the investigator before 
the right form of inquiry can be drawn up. 

We have left for consideration the weekly wage, and over- 
time and special payments, the last two of which can be grouped 
together. The ordinary weekly wage is a sufficiently general 
and definable quantity in most subdivisions of most industries. 
A foreman could generally state how much is earned in an 
ordinary full week for each of the hands under him. In many
cases there is an hourly or weekly sum regulated by a trade union, 
as in the building trades. In others, as in the cotton industry, 
piece-rates are so regulated as to bring out a definite sum 
for the week's work graduated in relation to the difficulty of the 
task ; in general, a very rapid survey of the wage-book will show 
what the worker in each subdivision will make on an average. 
Thus the average weekly wage in an ordinary full week can be 
found with considerable accuracy, but this takes us only part of 
the way in the calculation of annual earnings ; we need to know 
in addition to this how many full weeks are made in the year. 
It is the method by which this is attempted on the printed form 
that is open to most criticism. The questions used are on 
page 35, and afford a good example of the general difference 
between the quæsita and the data which are attainable. The
quæsitum is: To how many full weeks' wage are the annual
earnings equivalent, allowing for slack weeks and overtime?
The first crucial question to decide is: Are we to allow for an
average loss of time, say a week in the year, through sickness, or
are we to allow only for time lost through failure of
work? Since sickness is an individual not a general
misfortune, it will be better to exclude it if possible.
Now overtime in one season, especially if its wages are on "time- 
and-a-quarter " or "time-and-a-half" basis, very quickly tends 
to balance slack time at another season, though it may be sup- 
posed that it is rarely the case that more than the normal week's 
wage is averaged through the year. Thus it will be logical as 
well as simple to estimate the year's earnings as so many normal 
weeks' wages. For example, if we found that two weeks were 
lost through sickness and three through the mill stopping, and 
that overtime in one busy month had added wages equivalent to 
two normal weeks, we should have forty-nine weeks' full wage. 
The figures which will give this result will be the total sum paid 
in wages in the factory in the year divided by the aggregate normal 
week's wage of the people dependent on the factory, supposed 
all at work. Thus, if 1,200 hands (men, women, and children) 
would, if all at work, make £1,000 in a normal week, and this
was the average number dependent on the particular mill, and if
£48,000 was paid in the year in wages, annual earnings would
be equivalent to forty-eight normal weeks, and earnings would
average £40. Now the total paid in wages is generally kept
separate in business accounts, but the number dependent on the
mill for work is often not known accurately; for the personnel
of a large establishment is subject to continual change, and the 
manager would not know whether a person who left went to 
another mill or got no work. The total number of all who had 
worked there during the year would be too great for this purpose, 
and the number at work in a normal week too small. The 
number open, perhaps, to least objection is the number at work 
in the busiest week of the year ; for those absent except through 
sickness when trade is busy cannot be said to be dependent on 
the factory, but if not at work elsewhere are among the per- 
manent unemployed ; very few workpeople indeed will be 
taking their holiday at a busy time, and it may reasonably be 
supposed that all the factories in the same industry will have 
their busy and slack seasons at nearly the same time. The 
answers then to the printed questions — Total paid in year, and 
number of hands in busiest week — tell us all we need to know, 
if we may make this assumption ; for then the total sum paid as
wages in the year, divided by the maximum number employed 
in the busiest week, gives the average annual earnings. To find 
the equivalent number of normal weeks, multiply the maximum 
number employed by the average wage found on the second page 
of the form, so that the product shows the aggregate weekly 
wage if all were employed, and divide the total paid in the year 
by this product. 
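
The arithmetic of this paragraph may be set out as a short sketch in Python. The function names are illustrative only and are not taken from the Wage Census; the figures are those of the supposed mill above (1,200 hands, £1,000 for a normal week, £48,000 paid in the year), and a year of 52 weeks is assumed.

```python
from fractions import Fraction

WEEKS_IN_YEAR = 52  # assumption: lost time is reckoned against a 52-week year


def average_annual_earnings(total_paid_in_year, hands_in_busiest_week):
    """Total wages paid in the year divided by the hands in the busiest week."""
    return Fraction(total_paid_in_year, hands_in_busiest_week)


def equivalent_normal_weeks(total_paid_in_year, hands_in_busiest_week,
                            average_normal_weekly_wage):
    """Total paid in the year divided by the aggregate normal weekly wage."""
    aggregate_weekly_wage = hands_in_busiest_week * average_normal_weekly_wage
    return Fraction(total_paid_in_year) / aggregate_weekly_wage


# Figures of the supposed mill in the text (pounds sterling).
hands = 1200
total_paid = 48000
normal_weekly_wage = Fraction(1000, hands)  # £1,000 a week among 1,200 hands

earnings = average_annual_earnings(total_paid, hands)                   # £40
weeks = equivalent_normal_weeks(total_paid, hands, normal_weekly_wage)  # 48
lost_weeks = WEEKS_IN_YEAR - weeks                                      # 4

# The earlier illustration: two weeks lost through sickness, three through
# the mill stopping, and overtime equivalent to two normal weeks.
weeks_in_illustration = WEEKS_IN_YEAR - 2 - 3 + 2                       # 49
```

Exact fractions are used so that the quotients come out as whole numbers where the text gives whole numbers; with real returns the results would generally be rounded.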

In the Cotton industry the sum of the greatest numbers
employed (if these may be taken as equivalent to the
numbers employed in those weeks when the wage bill
was highest) was, in 1885, 87,887. £3,148,566
was paid in wages in that year in the factories
making returns. Average annual earnings were therefore
£3,148,566 ÷ 87,887 = £35 16s. The average wage in a normal week in
1886 was 15s. 2½d.; the product of this and 87,887 is £66,830.
The equivalent number of normal weeks' work is
£3,148,566 ÷ £66,830 = 47.
Hence we may conclude that, if our basis of calculation is correct,
five weeks was the average lost time at that date.
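
The cotton figures can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, the conversion assumes the usual reckoning of 20 shillings to the pound and 12 pence to the shilling, and a 52-week year is assumed in reckoning the lost time.

```python
def sterling_to_pounds(pounds=0, shillings=0, pence=0.0):
    """Convert an amount in pounds, shillings and pence to decimal pounds."""
    return pounds + shillings / 20 + pence / 240


def pounds_and_shillings(amount):
    """Split decimal pounds into whole pounds and whole shillings (pence dropped)."""
    total_shillings = int(amount * 20)
    return divmod(total_shillings, 20)


hands_in_busiest_weeks = 87_887   # sum of the greatest numbers employed, 1885
wages_paid_1885 = 3_148_566       # pounds paid in wages in the year
normal_weekly_wage = sterling_to_pounds(shillings=15, pence=2.5)  # 15s. 2½d.

annual_earnings = wages_paid_1885 / hands_in_busiest_weeks
aggregate_weekly_wage = normal_weekly_wage * hands_in_busiest_weeks
normal_weeks = wages_paid_1885 / aggregate_weekly_wage
lost_weeks = 52 - round(normal_weeks)

print(pounds_and_shillings(annual_earnings))  # (35, 16), i.e. £35 16s.
print(int(aggregate_weekly_wage))             # 66830
print(round(normal_weeks))                    # 47
print(lost_weeks)                             # 5
```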

This is not the method adopted in the General Report of the 
Wage Census ; there the total paid in 1885 is divided by the 
number employed in a given week in 1886. This number is
certainly too small, less than the number dependent on the
trade, and as might be expected gives on analysis absurd 
results in some cases. It is to be noticed that the method here 
described cannot be employed in those few industries which the 
employés are able to leave in the slack season in order to earn
wages in other trades which may then be exceptionally busy. 

Since there is no reason why the number absent through 
sickness should differ in the busiest week from the average 
number so absent, it is clear that the estimate we obtain for 
average lost time (five weeks in the cotton industry) is in addition
to the average time lost through sickness ; this may often be 
estimated from the returns of friendly societies. 

In the corresponding French wage census, of which the 
results were published in 1898,* an estimate of the number of
days' work obtained in the year is formed on a different basis. 

* Salaires et Durée du Travail, 1897, pp. 15, 16.
The data collected were: (1) The variation each month of the
personnel in each industry, which is found to average 4 per cent.
for the year; that is, for each 100 employed, 96
are found who have been in the same establishment
for as much as twelve months. (2) The differences between
the maximum and minimum numbers employed in each establishment
month by month during the course of a year, which
are found to average 19 per cent. of the (? average) personnel.
From this we may perhaps draw the conclusion that, on an
average, half this number, at least, are in general out of work.
(3) The number of different persons who have been employed in
each establishment at one time or other in the year; this is
found to be 140 for each 100 permanently employed, from which
the legitimate conclusion is that the average number of unemployed
is not so much as 40 in 140, i.e., 28 per cent. These two
percentages, 9 per cent. and 28 per cent., are taken to be the
inferior and superior limits of average lack of work. This information
is more detailed and perhaps more reliable than that
on which the method, used above for the English figures, is
based. Data obtained from syndicates of French workmen
indicate about 20 per cent. as the average want of work; the
English figures obtained by the method described above from
the whole wage census yield about 12 per cent.
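
The two limits quoted for the French figures follow from simple proportions, which the sketch below reproduces in Python. The function name is illustrative and the reading of the data is the one given in the paragraph above: the lower limit is half the spread between the maximum and minimum numbers employed, and the upper limit is the surplus of different persons over those permanently employed, taken as a share of all the different persons.

```python
def lack_of_work_limits(spread_pct, persons_per_100_permanent):
    """Return (lower, upper) limits of average lack of work, in per cent."""
    lower = spread_pct / 2
    surplus = persons_per_100_permanent - 100
    upper = 100 * surplus / persons_per_100_permanent
    return lower, upper


lower, upper = lack_of_work_limits(spread_pct=19,
                                   persons_per_100_permanent=140)
print(lower)            # 9.5, quoted in the text as 9 per cent.
print(round(upper, 1))  # 28.6, quoted in the text as 28 per cent.
```

Both the 20 per cent. indicated by the French syndicates and the 12 per cent. found for the English wage census fall between these limits.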

This somewhat lengthy discussion on the few questions 
included on the first page of the form is a good illustration of 
the necessity of considerable preliminary study before a blank 
form can properly be drawn up. Space does not allow a 
detailed criticism of the rest of the form. 






Section 3.— The Work of the Labour Department. 

The Labour Department of the Board of Trade was founded in
1893; its functions are to collect and publish information, chiefly
statistical, relating to the economic conditions of
workpeople, and the state of the market for labour.
Its work lay almost entirely in virgin ground ; new sources of 
information had to be tapped, new methods developed. While 
it was untrammelled by tradition, it could avail itself of the ex- 
perience of the Board of Trade, and was already in touch with 
a widely extended organization. Under these circumstances 
it was soon able to attract a comprehensive and continuous 
supply of valuable information ; and the methods by which it 
accumulates and compiles its statistics should be interesting and 
instructive to all those whose business it is to work in any 
statistical field. 

The figures which are received periodically by the Depart- 
ment are published monthly in the Labour Gazette. Here the 
first article each month is on the " State of Employ- 
ment." As before we must first consider the 
question, What are the exact objects of the investigation
of which the results are here published? They are to find 
out how many persons are out of work in each trade and 
district, what percentage they form of all dependent on each 
industry, and how this percentage changes month by month 
and year by year. The next question is, How much of this 
can be discovered, and, if we cannot measure these numbers 
directly, what are the best allied quantities to measure? 
Since no universal register is kept of the unemployed, it would
seem easier to estimate the number employed, since
an employer can generally state how many workpeople
he has at work at any given time. If we cannot discover
the number of men at work, we may perhaps be able to find the 
number of machines, furnaces, mines, &c., at work, and deduce 
the number of men employed with them, and thence the number 
of unemployed ; or we may find for how many hours work was 
carried on in a factory or a mine ; or we may even go a step 
further back and find the amount of goods produced, and thence 
estimate the other quantities; or we may learn the total amount
paid in wages. These are the numerical methods; but there are
others, useful if not so exact. We can obtain reports as to the
condition of employment in the various districts or industries,
not in numbers, as is generally necessary, but with descriptive
adjectives, such as busy, slack, improving, much the same,
which may lead to numerical estimates, or may serve to check
results. Lastly, organizations for facilitating employment may
send in returns of the applications made to them. Nearly all
these methods are in use at the Labour Department.

Next, who possesses the necessary information? As regards
the number unemployed, the only registers kept are those of trade
unions, to whose secretaries inquiries should be
addressed. The figures so obtained will naturally
only relate to those sections of an industry where trade unionism
exists. As regards the number employed, the masters are the
authorities, and forms must be sent to them asking the numbers
at work day by day, or at longer intervals. With respect to the
number of machines at work, the number of shifts, and the
total wages paid, the masters again have the information. For
the amount produced, the masters, or in some cases officials to 
whom they make returns, can supply the facts. For general 
information as to the state of employment, some presumably 
competent person, in touch with all the factories in an industry, 
or all the trades of a district, must be impressed to forward 
periodical reports. The Labour Department is in touch with a 
great number of such correspondents, many of them connected 
with the trades councils of their towns. 

The question as to whether the information will be given 
impartially and willingly need not detain us long in this case ; 
for, generally speaking, the returns are simply automatic copies 
or registers of known numbers, and would only be partial if 
wilfully falsified ; and since the returns are made periodically, 
the persons concerned regard them as a matter of course, and, 
once they have commenced, continue willingly to forward the 
requisite figures. 

By the courtesy of the Labour Department I have been able 
to obtain copies of most of the forms in use. There are some 
forty in all, each suited to some special industry or method of 
investigation. 

It must be remembered that the Labour Department had to 
form its own intelligence organization, and initially was obliged
to apply to persons able to give information, just
as any private investigator would. A connection
had therefore to be established with trade unions
and other societies, and with manufacturers; and, when a nucleus
had been formed, continual efforts were necessary to extend the
organization in all directions. One or two of the circular letters
written for this purpose are given on the following pages, since
they are typical of the method which investigators must employ
to enlist the help of possible informants who are uninterested.
The points to notice are: (1) the statement of the exact
purpose for which the information is wanted; (2) the simple
and explicit direction as to what is to be done by the
informant; (3) the undertaking that the information will not
be used in any way that can do, or appear to do, him injury.
Here is one of the earlier letters, opening a connection:

Labour Department, 1894. 

Dear Sir, — The Labour Department of the Board of Trade, which 
is charged with the duty of collecting periodical statistics as to the
condition of the Labour Market, is desirous of obtaining fuller informa- 
tion from month to month with regard to the state of employment in the 
Pig Iron Industry. For this purpose, the Department would be glad 
to receive monthly information from a large number of the employers 
in the United Kingdom as to the number of furnaces in blast and the 
numbers of workpeople employed, on the average, at each furnace. 

I shall accordingly be glad if you will be kind enough to assist the 
Department in making this inquiry complete by filling up and returning 
to me before the 4th of May the enclosed form. Postage need not be
prepaid if the reply is addressed to " The Commissioner for Labour " 
at the address given above. 

The results of the inquiry will not be published in such a form as to 
render possible the identification of particular returns, — Yours, &c.

When the Department had organized its work, and tabulated 
and published some of its returns, the next step was to endeavour 
to achieve completeness. When many are known to have given 
information, the more cautious will be encouraged, the less ener- 
getic be ashamed to be less public-spirited than their neighbours, 
and the critical anxious to correct mistakes. The first of the 
following letters, which is used for general purposes, takes
advantage of these tendencies, and the second is another excellent
example of the method of extending the organization : —



1895. 

Dear Sir, — I am forwarding herewith a copy of the "Labour Gazette" 
for the current month, and beg leave to draw your attention to the article
therein dealing with the state of employment in the ........

The Labour Department is very desirous of making the information 
contained in these monthly reports as complete as possible, and trusts 
you will kindly assist by filling up and returning the enclosed form. 
You will notice that the form is of a very simple kind, and one that can 
readily be filled up without much trouble. 

I may add that Returns are regarded as strictly confidential and are
only used to produce general statistical results in which the identity of 
individual returns is lost, — I am, &c. 

March 1895. 

Dear Sir, — This Department has for some time past received 
monthly Returns, both from the Dock Companies, and the Ship-owners 
who do their own unloading work in the port of London, with regard 
to the number of Dock Labourers employed. These Returns are 
collected with a view to throwing light on the periodical fluctuations in 
the employment of this class of labour ; but the figures are published in 
a general total and not in such a way as to make possible the identifi- 
cation of particular firms supplying the information. The article on 
page 36 of the enclosed Labour Gazette will show you the use made 
of the Returns. 

Hitherto no exact information has been obtained with regard to 
employment of labour at the wharves, and you will readily see that the 
addition of such information would very greatly increase the value of 
the statistics. The managers of several of the most important wharves 
on both sides of the river have been good enough to promise to make 
monthly Returns; and I should be greatly obliged if you could see 
your way to assist the Department by supplying the information speci- 
fied on the enclosed form, not later than the date there indicated. 

You will observe that a form is provided for the daily number of 
labourers employed, and, alternatively, for the average weekly number. 
The daily number would, on the whole, be the most useful for the pur- 
poses of the Department ; but if for any reason you cannot see your 
way to supply such detailed information, a weekly average would be of 
value. — I am, &c.
Another letter may be given as serving to encourage those
who have already engaged in the good work. It will be noticed 
that, though now more concise, it is still insinuating. 



Agricultural Labour in January, 

Dear Sir, — I am instructed by the Commissioner for Labour to ask 
you to be good enough to favour this Department with replies to the 
questions on pages 2-4 of this form, by Friday, 4th February 1898. 

I beg at the same time to thank you for your kindness in sending
answers to questions put to you by the Department on former
occasions. — Yours, &c. 

These letters are well worth noticing because they have 
assisted to build up a very efficient organization for information 
out of nothing, and have succeeded in eliciting answers from 
uninterested men of business, who are not given to spending 
time and trouble on unremunerative labour. 

Please forward this Return to the address on the back not later than the
Fifth of the month succeeding that to which it relates. No postage
need be paid.

Return of State of Employment

in Month of ........ 189 .

Name of Society ........

Total number of members in Society at close of month ........

Number receiving out-of-work pay in last week of month ........

(Do not include members on strike or locked out.)

State, if possible, number of members entirely unemployed but not
receiving benefit in last week of month ........

State of employment for month ........

If any dispute, change in wages or hours of labour has occurred, please
say, and the necessary forms will be forwarded at once ........

Remarks ........

Signed ........
Secretary.
Date ........ 189 .
The form given above is that issued to trade-union secretaries,
and it is by its means that the only perfectly definite
measure of want of employment is obtained. It 
should be remembered that it is filled in monthly 
by the same official, and requires no special explanation. Since 
the first attempt to draw up such a form is apt to present 
difficulties, this may be noticed in detail. First, we find an instruction
as to the way it should be returned. Most forms are
provided with a printed envelope to save trouble and mistakes,
and postage is paid by the investigator. Next comes a brief
heading and the date, and then the name of the
society, an item used for reference and further
inquiry, but not for publication. Now we need
to know chiefly the percentage unemployed, but secondarily 
the total number. The questions most easily answered should 
be asked, and the calculations done at the central office, for the 
trade-union secretary may make arithmetical mistakes. Again, it 
is not the numbers day by day that are asked, for they are hardly 
known ; nor week by week, which would give trouble ; nor even 
the average for the month, for that might lead to guess-work ; 
but a definite day or week is decided on, the same for all trades. 
For purposes of comparison trade with trade, or month with 
month, this is found sufficient. 
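
The calculation left to the central office can be sketched as follows in Python. The record layout and names are hypothetical, chosen to mirror the questions on the trade-union form above, and the two societies with their figures are invented purely for illustration; how the Department actually combined its returns is not stated here.

```python
from dataclasses import dataclass


@dataclass
class UnionReturn:
    """One monthly return from a trade-union secretary (hypothetical layout)."""
    society: str
    members: int                         # total members at close of month
    on_out_of_work_pay: int              # receiving benefit in last week of month
    unemployed_without_benefit: int = 0  # stated "if possible" on the form


def percentage_unemployed(returns):
    """Percentage unemployed over all returns, worked out centrally."""
    members = sum(r.members for r in returns)
    if members == 0:
        raise ValueError("no members reported")
    unemployed = sum(r.on_out_of_work_pay + r.unemployed_without_benefit
                     for r in returns)
    return 100 * unemployed / members


# Invented figures, for illustration only.
returns = [
    UnionReturn("Society A", members=2000, on_out_of_work_pay=60,
                unemployed_without_benefit=10),
    UnionReturn("Society B", members=500, on_out_of_work_pay=5),
]
rate = percentage_unemployed(returns)
print(rate)  # 3.0
```

Members on strike or locked out do not enter the calculation, since the form directs that they be left out of the numbers returned.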

We notice next an important point connected with the definition
of Unemployment, and also an illustration of the necessity of
studying the figures at their source. It is not
stated explicitly in each Gazette whether men on
strike or locked out are included as "unemployed" or not. A
reference to this form shows that they are not included, and, 
therefore, before conclusions are drawn as to the amount of 
work obtained year by year, the excellent statistics relating to 
labour disputes given in another part of the Gazette must be 
studied. 

All members of trade unions out of work do not at once 
receive " benefit," the technical term for any payment of union 
funds, and therefore a correction must be made in some cases 
for those who have not yet come " on the society." The 
number is likely to be known accurately to the local secretary ; 
but if the form had to be filled in at a London office, say, for 
the whole Amalgamated Society of Engineers, the number 
would not there be known. At this point we are left in some 
doubt as to the methods of the Department; for we are not told
whether these forms are sent to all local secretaries, or whether
they are filled up centrally for whole districts, or what additional
information is obtained from other sources. 

The next line is for a qualifying adjective which will serve 
to check labour correspondents' information, and to indicate 
whether the last week was typical of the month. 

When an organization is ready for a special purpose, any
secondary use may be made of it that will not vitiate its chief
end. The Labour Department is always anxious
to hear of all changes of wages, and in general has
to detect their existence for itself; hence no opportunity of
obtaining such information is lost, and this widely circulating
form is used for the purpose.

Lastly, if the informant is an intelligent man and acquainted 
with the methods of the Department, it is well to give him an 
opportunity of adding any relevant remarks that may occur to 
him. On the line " Remarks " might be given some reason for 
any exceptional numbers or information as to trade prospects 
which might furnish a clue to the Department for other investi- 
gations. The paper is signed and dated to show that it has 
been filled in officially and at the right time. 

These forms are not always filled in and returned to date ; 
sometimes special application has to be made for them, and 
occasionally the necessary numbers have to be interpolated from 
other sources. 

The next form given is more complicated, and illustrates 
two of the methods mentioned, finding the number of persons 
employed and the number of days worked. 
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Please fold and return this form by the 29th December to the address 

given on back, 

EMPLOYMENT AT. COAL MINES. 

Individual Returns are regarded as confidential and not published 

separately. 

County in which Pits are situated 

Name of Firm or Company 

Postal Address to which form should be sent 



Number of shifts usually worked in each 34 hours 



Names of Pits or Seams. 



No. of " Other Workpeople " 

feneral to all or several 
its and not included 
above, and Number of 
Days worked by them in 
four weeks - - - 



State whether 

raised was 
"House," 
"Steam," 

•• Gas,' 

*' Manufac* 

turine^" or 

"Coking" 

Coal. 



•:j 



•Number Of 

Workpeople 

paid on last pay 

day in four 

weeks ending 



25th 
Dec. 
1897. 



No. 



26th 
Dec. 
1896. 



No. 



t Number Of 
Dayi 

on which 

Coal was 

hewn and 

wound at the 

Colliery in 

four weeks 

ending 

December 95, 

X897. 



Days. 



{Short Tima. 

If any of 

the days stated 

in Col. s were 

shorter than 

usual, please 

state in Col. 6 

the total amount 

of time to be 

deducted by 

Labour Dept. in 

four weeks. 



Days. Hours. 



* The number of Workpeople should include all Men and Boys, &c., 
employed in and about the Pits, except Clerks and Managers. The number 
should also include " Drawers " and others who may be paid by " Hewers." 

† The number of Days, whether full or not, on which Coal was "hewn and
wound" should be inserted in this column. If on any of these days short 
shifts only were worked, the extent of the time lost should be stated in 
Column 6 ; but it should be left to the Labour Department to deduct from 
the figures in Column 5 the Short Time, if any, given in Column 6. If the 
time worked on Saturday is usually shorter than on other days, no reduction 
should be made on that account. 

‡ Short Time. — Please state here any special reasons for Short Time :—



Note. — A General Summary of the Returns received will appear on 15th January in
the Labour Gazette, which can be ordered through any Newsagent, price 1d.



The form has one or two peculiarities. A colliery company
has often several pits, so that in the first place it is not obvious
at which address this information can be most
readily given, while it is important not to waste
time and trouble in forwarding from office to office ; and, in the
second place, not only will work be done for different lengths of
time in different pits, but also there will be variation from seam
to seam in the same pit. This was not recognised in the first
form sent out, but a second had to be sent distinguishing the 
pits and asking for subsidiary information. In this form may 
be seen the modifications that must be introduced to suit the 
questions to particular industries. In a colliery the factor which
determines the state of employment is generally not the number
employed, but the number of days' work, the number of days
"coal is wound." A colliery at full work may make four, five,
or six days a week, or eleven days a fortnight (leaving one day 
a fortnight for repairs), according to the custom of the district 
and the state of the trade, and there may be two shifts or three 
in the twenty-four hours. If work is slack, the number of shifts 
per fortnight, which is really the essential quantity to know, will 
be diminished, and the alteration will very likely affect all
employés equally. Again, since the colliers are not all at work at
once, the question is not " how many are at work ? " but " how 
many are paid?" the pay-day, once a fortnight, or however 
often it may be, being perhaps the only time when all the 
workers are together. The number at pay-day is, therefore, the 
number employed in the mine, a quantity varying as new seams 
are opened or old seams worked out, and the number of days
on which there is work is the factor which determines the
amount of work obtained per workman. Notice that the questions,
number of shifts per day, number at pay-day, and days at
work, are precisely those which the manager will find easiest to 
answer. Since, however, days are of different lengths, depending 
on the demand for coal, the good working of machinery, the 
presence of the necessary trucks, and the efficacy of the railway 
arrangements for clearing the yards, and other circumstances, it 
is necessary to know whether on any working days, winding 
stopped early or the shift was shortened ; hence the question in 
column 6, which will give the manager more trouble. 
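The deduction which the Department makes from the figures in column 5 may be sketched as follows. This is only an illustration under stated assumptions: the function names, the eight-hour full day, and the sample figures are hypothetical and do not come from the form, which leaves the conversion of hours into days to the Labour Department.

```python
from fractions import Fraction


def net_days_worked(days_wound, short_days=0, short_hours=0, hours_per_day=8):
    """Days of full work: Col. 5 less the short time stated in Col. 6.

    hours_per_day is an assumed length of a full day, used only to turn
    the "Hours" part of Col. 6 into a fraction of a day.
    """
    if hours_per_day <= 0:
        raise ValueError("hours_per_day must be positive")
    short_time = Fraction(short_days) + Fraction(short_hours, hours_per_day)
    net = Fraction(days_wound) - short_time
    if net < 0:
        raise ValueError("short time exceeds the days on which coal was wound")
    return net


def man_days(workpeople_paid, net_days):
    """Rough volume of employment: number at pay-day times net days worked."""
    return workpeople_paid * net_days


# Hypothetical return: 22 days wound, short time of 1 day 4 hours, 350 paid.
net = net_days_worked(days_wound=22, short_days=1, short_hours=4)
print(float(net))                 # 20.5
print(float(man_days(350, net)))  # 7175.0
```

The number at pay-day and the net days are kept as separate quantities, since the text treats the first as the number employed and the second as the measure of work obtained per workman.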

In the form relating to dock labour, the question is simply, 
how many are employed, not at the end of the month, but
day by day; for labour at the dock fluctuates violently and
continually, as may be seen from the monthly diagram in the
Labour Gazette. On that relating to the Surrey
Commercial Dock, the question is again modified
to suit, it may be supposed, a special method of bookkeeping, and 
reads, " What is the amount of wages paid at the end of each 
week?" Wages are perhaps a better measure of dock labour 
than number employed since the number of hours worked varies 
continually, men being taken on for long or short hours, but the 
rate of pay varies little. On both forms there is a question 
as to any special holidays or other events affecting work. On 
the form sent to pig-iron works, the question asked is as to how 
many furnaces are "in blast," or have been "blown out," or
re-lit; on that relating to steel, iron, and tinplate works, the 
information required is " the number of shifts " worked in four 
weeks. Another form is to be filled up by a single correspon- 
dent for a wide district, and the returns are entered under the 
headings — Number of mills (1) running full time and giving full
employment; (2) running full time, but giving only partial 
employment ; (3) running short time ; (4) stopped. 
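The remark that wages are a better measure of dock labour than the number employed can be illustrated by a short calculation. It is a sketch only: the hourly rate and the weekly wage bill below are hypothetical figures, and the rate is assumed constant, as the text supposes it to be nearly so.

```python
PENCE_PER_POUND = 240  # pre-decimal sterling: 20 shillings of 12 pence


def hours_from_wages(wages_pounds, rate_pence_per_hour):
    """Estimate hours of labour from a wage bill at a nearly constant rate."""
    if rate_pence_per_hour <= 0:
        raise ValueError("rate must be positive")
    return wages_pounds * PENCE_PER_POUND / rate_pence_per_hour


# Hypothetical week: 150 pounds paid in wages at an assumed 6d. per hour.
print(hours_from_wages(150, 6))  # 6000.0
```

Because men are taken on for long or short hours, a count of heads would not give this quantity, whereas the wage bill divided by the rate does so directly.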

Another instance of adaptation of the form to a particular 
industry is afforded by the inquiries as to agriculture. In this
case the number of employers is very great, they
are very much scattered, and little used to statistical
inquiries, and the labourers are for the most part uncombined.
On the other hand, in the majority of villages agriculture is the 
predominant industry, every one knows all about every one else, 
and any one intelligent person can give an accurate account of 
the state of labour in his district. It is necessary then to arrange 
with a labour correspondent, a farmer, or a member of the Village 
Council, or the chairman of the District Council, and to apply to 
him monthly for information. 

Only one general organization is necessary for the collection 
of the three groups of figures wanted by the Department for all 
industries. These groups relate to the state of employment, which 
fluctuates continually ; changes of wages, which in some cases 
take place at stated times, in others occur irregularly; and strikes, 
which may begin at any time and last a long while. In the 
case of agriculture, the three groups of questions are placed on 
a single form, though the practice has changed a good deal since 
1893. 



One form, that in use in 1894, asks for complete details as 
to wages and the number employed at the harvest, with a page 
devoted to strikes, and two spaces for remarks on the weather 
and on things in general. 

The next, July 1895, deals with haymaking, strikes, and 
wages. The questions here are as follows : — 

1. Were there any able-bodied agricultural labourers in irregular work
in your Parish during the month of June ?

2. If you answer question 1 in the affirmative, can you give the numbers
and state about what proportion those in irregular work were of the
total number of able-bodied agricultural labourers ?

3. If you can give the particulars asked for in questions 1 and 2 for any
neighbouring Parishes, kindly do so.

4. What daily or weekly wages are being paid in the district to the
regular farm hands during haymaking ? Also state how much is
paid for overtime and what perquisites are given, such as food,
beer, &c.

5. What daily or weekly wages are being paid in the district to extra
hands during haymaking ? Also state how much is paid for overtime
and what perquisites are given, such as food.

6. Were there any agricultural strikes in your neighbourhood during
June ? If so, please give the following particulars with reference
to each strike : —
(1) The date ; (2) The cause ; (3) The duration ; (4) The
result ; (5) The number of men affected.

There are differences in the forms for nearly every month in 
the year ; and the questions have been modified as experience 
suggested till they are finally as follows : —
Union          Parish

I.— STATE OF EMPLOYMENT.

1. Approximate number of able-bodied agricultural labourers in Parish.
(If this question has been recently answered by you, you need not repeat your reply.)

2. Were there any able-bodied agricultural labourers in irregular work
in your Parish during the month of January 1898 ?

3. If so, can you say about how many were in irregular work in the
last week of January 1898 ?

4. Was employment more regular in January 1898 than in January 1897 ?

If you can give the above particulars for any neighbouring Parishes, or
for the whole of the Poor Law Union, kindly do so.

II.— CHANGES IN RATE OF WAGES IN JANUARY 1898.

Changes in Weekly Cash Rates of Wages of Ordinary
Labourers in January 1898.

(N.B. — Ordinary labourers do not include foremen, shepherds, cattlemen,
carters, waggoners, teamsters.)

Column headings :

Locality in which Change took place. (State whether Change applies to
the whole County, or to which Poor Law Unions or Parishes within it.)

Approximate Number of Labourers who have had a Rise or Fall in
Wages in January 1898.

Cash Rates of Wages per Week (s. d.) : Before Change ; After Change.

Please state in this column for comparison what the Rate of Wages
was in January 1897.

Name of Correspondent
Postal Address
For convenience of printing the exact spacing allotted for
answers has not been introduced in these reprints. In the
agricultural forms a great many square inches are allowed for
such an answer as "Yes" ; in the others the space is allotted in
proportion to the amount of information expected. The ques- 
tions as to wages on this form will be alluded to presently. 

The information collected by the Department as to trade 
disputes is detailed and important. The principal questions 
in the investigation relate to the causes of dis- 
putes, the methods of settling them, and the total 
loss of money to workpeople and employers. Of these the 
first two are not statistical questions, but are inserted because 
the inquiry has three objects : — (1) A general examination
of the causes of and remedies for strikes ; (2) an inquiry as 
to the course of each particular dispute, so as to bring the 
Conciliation Act into operation if possible, or by disseminating 
information to assist an arrangement or compromise; (3) the 
collection of statistical information. 

The Department is dependent on its own alertness for 
knowledge of the existence of disputes, and its chief sources 
of information are the daily press (London, Provincial, and 
Trade) and special local correspondents, who are expected to 
inform the Department, directly work is stopped owing to a 
dispute, on a special form. 

As to the question, Who knows the facts? obviously the 
only people are the employers and employed ; and since they 
may take different views on all subjects connected with the 
dispute, both parties must be addressed. In this investigation 
partiality and bias in the answers will be at a maximum ; the
questions must be restricted as far as possible to facts about
which two opinions are nearly impossible, and any questions
which will not be answered willingly should be omitted. 

On pages 54, 55 is given the form sent to Trade Unions in 
1895, on pages 56, 57 that used in 1897, and on page 58 the 
letter accompanying the latter. 
LABOUR STATISTICS.— STRIKES AND LOCK-OUTS.
Questions as to Strikes and Lock-outs. 1895.

[The form is ruled in three columns : Questions ; Answers ; General Observations.]

1. Name of employer, firm or company, and trade.
[Where more firms are involved than one, or the strike
or lock-out has been general over a locality, the number
of employers or firms to be stated as nearly as possible.]

2. Cause or object of strike or lock-out.

3. Whether strike was ordered or approved by trade union.

4. Date of commencement and termination of strike or lock-out.
[Answers : Date of commencement ; Date of termination.]

5. Result of strike or lock-out.
[If dispute has been respecting increase or reduction of
wages or hours of work, state exact amount of increase or
reduction (if any).]

6. Mode of settlement.

7. Number of persons affected.
(1) Number directly on strike or locked out
a. At beginning of dispute
b. At end of dispute
[Distinguish between adult men and women, and
apprentices or other young persons.]
(2) Number employed in factories or works
where strike or lock-out occurred, and who
were thrown out of work thereby, but were
not directly on strike or locked out.

8. Estimated total amount of wages earned in a full
week (exclusive of overtime) by those affected
immediately before and after strike or lock-out.
[Answers : Before Strike or Lock-out ; After Strike or Lock-out.]
a. Directly affected
b. Indirectly affected

9. Number of those on strike or locked out who
belong to trade unions.

10. Amount expended in support of persons on
strike or locked out.
[Answers : Amount per head per week ; Total amount expended.]
a. By union
b. By other strike fund

11. Number of persons who "went in" or returned
to work before termination of the dispute.

12. Please suggest means of settling or preventing
labour disputes.
Information for the use of the Labour Department, Board of Trade, 44 Parliament St., S.W.

STRIKES AND LOCK-OUTS.

Part I.
[To be forwarded as soon as possible, without waiting for settlement of dispute.]

Questions.          Answers.

1. Name of Trade affected.

2. Number of Firms involved.
[If an Employers' Association is concerned in the dispute,
please give the name and address of its Secretary.
If there is no such Association, please give the names and
addresses of the principal firms involved in the dispute.]

3. Cause or object of strike or lock-out.
(Enclose copy of any application or Notice connected with the
origin of the dispute.)

4. Date of the first day on which the workpeople were absent
from work through strike or lock-out.
(If notices were handed in, give also date of notice.)

[Questions 5 and 5a are answered in columns headed : Occupations ; Men ;
Women ; Apprentices or other Young Persons.]

5. State occupations and number of workpeople (Unionists and
Non-unionists) directly on strike or locked out.

5a. State occupations and number of other workpeople (Unionists
and Non-unionists) employed in above establishments who
were thrown out of work owing to the strike or lock-out.

Total Number of workpeople affected*

* If any other workpeople were affected, respecting whom you can state no exact figures,
please give, if possible, the name and address of some person who could do so : —
Information for the use of the Labour Department, Board of Trade, 44 Parliament St., S.W.

STRIKES AND LOCK-OUTS.
Part II.
[To be forwarded as soon as the dispute is terminated.] 1897.

Questions.          Answers.

6. Date of termination of strike or lock-out, i.e., the last week-day
on which the workpeople were on strike or locked-out, or the
date when the places of the strikers were filled up.
(If there was no definite end to the dispute, please state approximately
when it may be regarded as practically closed.)

7. Result of strike or lock-out.
(Enclose copy of any printed or written agreement that may
have been made.)

8. Describe the steps taken which resulted in the settlement,
giving the names of any organizations or persons who
assisted in bringing this about.

If the result involved a Change in the Rate of Wages or Hours of Labour, give
the following particulars for all workpeople whose wages or hours were changed,
whether Strikers or not : —

Column headings :

Occupations affected by Changes in Wages or Hours.

Number of Workpeople whose Wages or Hours were changed.*

Date from which Change takes effect.

Rate of Wages † in a Full Week, exclusive of overtime : Before Change ; After Change.

Hours of Labour in a Full Week, exclusive of meal times and overtime : Before Change ; After Change.

* This is not necessarily the number on Strike or Locked out.

† When there has been a change in piece rates, please give the percentage increase or decrease in piece
prices, and approximately the average earnings in a full week (exclusive of overtime) before and
after change.

Signature

Address

Date
Labour Department, Board of Trade,
44 Parliament Street, London, S.W., 1897. 

Dear Sir, — The Labour Department of the Board of Trade is 
desirous of obtaining a complete and accurate record of Strikes and 
Lock-outs, and Changes in Rates of Wages and Hours of Labour in 
the United Kingdom as they occur, for publication in the Annual 
Reports presented to Parliament, and also in the "Labour Gazette," 
which is issued monthly. 

These statistics are collected and published by the Department in 
pursuance of the following Resolution adopted by the House of 
Commons on the 2nd March, 1886 : — 

"That in the opinion of this House immediate steps should be taken to ensure
in this country the full and accurate collection and publication of Labour 
Statistics." 

As the value of these statistics is greatly increased if the parties 
concerned co-operate with the Department by supplying accurate 
information, I should be glad if you would kindly answer as many as 
possible of the questions asked on the inner pages of this form so far 

as they relate to the 



If from any cause you are unable at present to answer the questions
on Part II. of the form, will you be so good as to fill in and return
Part I. at once, and send Part II. as soon as it is possible to do so.

I have to add that any information you may be good enough to 
furnish will be used solely for statistical purposes, and will not be 
published under your name. 

A circular asking for similar particulars is addressed to the employer 
affected by this dispute. — Yours faithfully, 



Chief Labour Correspondent, 
The letter given with the later of these forms is a particularly
careful one, showing the object of the inquiry, promising secrecy, 
and guaranteeing an impartial survey by the statement that 
similar forms are sent to employers and workmen. The 
forms addressed to employers are precisely similar in general 
appearance. 

The main difference between the forms is that the later is
divided into two parts, the first of which can be filled up directly
work is stopped by a dispute, so as to give the
Department a clue as to its magnitude and cause.
The second part is detachable, and is to be preserved till the 
dispute is ended, and then forwarded. The advantages of this 
method are that the Department has early information as to 
the exact facts about the strike, and that the figures are given 
while the facts are fresh in the mind of the informant, whereas 
at the end of a long struggle, they might have been forgotten. 
Should the second part not be forwarded, the Department would 
of course write for it, or send a duplicate. 

Question 1 on the earlier form is modified and split into two
on the second. Question 2 on the later is simpler than the
parenthesis of question 1 on the earlier, but asks for the more
important information as to employers' associations, which will 
lead to the blank schedule being sent to the addresses given. 

Question 2 of 1895, 3 of 1897, is the same on all four forms 
(the two to trade unions, and the companion forms to employers); 
it is not a statistical question, and probably leads to vagueness 
and to contradictory statements on the part of employers and 
employed; but the new parenthesis ("enclose copy, &c.") is 
important, for it leads to definite statements about which there 
can be no dispute. 

The next question on the earlier form had of course to be 
altered for the new double sheet. Since the chief statistical 
information needed relates to the exact number of days' work 
lost, it is necessary to know exactly the date of the commence- 
ment of idleness ; this day is therefore very carefully defined on 
the later form, as not that on which notices were sent in or any
preliminary steps taken, but that of the actual commencement of 
hostilities. Question 6 (date of termination) is also carefully 
worded. The date of notices (question 4) gives useful sub- 
sidiary information. 

There is considerable difference between questions 5 and 5a
on the later and the corresponding question 7 on the earlier.
The only difficulty in using the information
obtained from the new form arises at this point.
The number affected by a strike, especially the number indirectly
affected, changes continually, rising gradually to a maximum and
then rapidly decreasing as the dispute draws to a close. The
1895 form did not give enough information, for the numbers at
intermediate dates cannot be deduced from the numbers at the
beginning and at the end, so that we have not the necessary
data for determining what we chiefly want, the number of days'
work lost (i.e., the sum of the numbers of days lost by each
person affected). In the case of a long dispute, however, this
information is revised monthly at least, as is shown by the
monthly report in the Labour Gazette.
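The point that the numbers at the beginning and at the end do not determine the number of days' work lost may be seen in a small sketch. The figures are invented for illustration, and the function name is hypothetical; the only rule taken from the text is that the total is the sum of the days lost by each person affected, which is the same as the sum, over the working days of the dispute, of the number idle on each day.

```python
def days_of_work_lost(numbers_idle_each_day):
    """Sum over working days of the number of persons idle on that day."""
    return sum(numbers_idle_each_day)


# Two hypothetical disputes of six working days, each with 100 persons
# affected at the beginning and 20 at the end.
dispute_a = [100, 100, 100, 100, 100, 20]
dispute_b = [100, 300, 500, 500, 200, 20]

print(days_of_work_lost(dispute_a))  # 520
print(days_of_work_lost(dispute_b))  # 1620
```

Both disputes would give identical answers to a question asking only for the numbers at the beginning and at the end, though the loss in the second is more than three times that in the first; hence the need for returns at intermediate dates.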

The chief improvement in the new form consists in
allotting separate spaces for different occupations. Several
classes of workpeople will probably be affected in
different ways by a strike in a complicated industry.
Thus if the cotton spinners are on strike, very likely
the carders will go out either on a grievance of their own or
from sympathy. The spinners' assistants, the piecers, are at
once thrown out of work, as are also the overlookers of the 
mules. As the strike continues all the departments of the 
spinning mill will be closed, one after the other. In the form 
four lines are allowed for those directly affected, eight for other 
classes unwillingly on strike. A great dispute, however, is not 
limited in its effect to the spinning mills. The supply of yarn 
falls off and the weavers are stopped ; then the export trade is 
diminished, and dock labourers and sailors are thrown out of 
work, and so the influence of the strike spreads. It is out of 
the question to estimate completely these indirect effects ; but 
in order to trace them as far as possible, space is given on the 
second form for the address of any one who can give information 
about them. 

Question 6 on the earlier form, "Mode of settlement?" has 
been expanded considerably in the later one, since the question 
cannot well be answered in a single word, and the exact details 
are important for the non-statistical part of the inquiry.
Question 5 in the earlier has also been altered ; the important 
request for printed agreements is added, and the parenthetical 
part has been grouped with question 8 so as to form the new 
question 9. This alteration, the same on employers' and workmen's
forms, is worth special attention. The distinction between
" directly " and " indirectly affected " is practically useless, and 
difficult to maintain in filling up the form. It is far more im- 
portant to distinguish the different classes of workpeople, as can 
be done in the nine lines of the new form. Again, it is difficult 
to state the "total wages before and after," and the question 
leads to inaccuracies ; the new question 9 is far better, for it 
is precisely that easiest to fill in, and most useful when done, 
and is in the exact form wanted by the Department for its 
register of changes of wages and hours. It is important in 
this question 9 to include all, whether on strike or not ; hence 
we have the italics in the heading and a footnote to the second 
column. This footnote could be improved, for at present the 
wording is a little obscure, and the notice might be put with 
advantage in the heading. 

There remains a series of questions which have been dropped 
out in the later form. It may be supposed that it was found 
that the answers were not accurately given, that 
the inclusion of the questions overloaded the form, 
and by tending to inaccuracy in the answers led to inaccuracy in 
other details; while in cases when it was possible to obtain 
correct answers, it was found best to do so by other methods or
a separate inquiry. There are two sets of questions: trade 
unionists are asked the amount they spent ; employers the value 
of capital left idle. 

Question 3 on the older trade-unionist form has nothing to 
do with the statistical inquiry. Question 9 simply affects the 
relation of unionists to non-unionists, may lead to exaggerated 
answers, is not wanted for tabulation, and is apart from the main 
inquiry. Question 10 belongs properly to a separate inquiry ; the 
total might not be known to the secretary who fills in the form, 
and the amount expended by unions on "strike benefit" is com- 
piled annually from other sources. The question is too compli- 
cated to be placed with advantage at the end of a long form. 
Question 11 is too vague to lead to the information wanted, 
though knowledge of the facts is needed. Question 12 is hardly 
likely to yield any results worth having, since all possible means 
have long been canvassed. The answers are, however, tabulated 
in the Report on Strikes of 1894 (C. — 7901), which may be studied 
with advantage in connection with these forms, and some of 
them may be quoted : — "Give in to the wants of the men, so
that they are not extraordinary." " Abolish capitalism." " No 
means have yet been discovered." " Make all men Christian." 
" Fair argument." " A little more common honesty on the part 
of employers " (pp. 229-240). 

The questions omitted in the later, but present in the earlier 
employers' forms are subject to similar criticisms. The sub- 
division of question 9, distinguishing summer and winter wages, 
and the separate columns for hours, are only in the later form. 
A comparison of these two forms with any number of the Labour 
Gazette and the Annual Report on Strikes and Lock-outs of 1894 
will throw considerable light on the uses and difficulties of 
forms of inquiry. 
Section 4. — Statistics of England's Foreign Trade.

The original schedules which lead to many other statistics 
are interesting, but limits of space must restrict us to one more 
typical inquiry, that which leads to our statistics of foreign 
trade. 

In the population census the filling in of the form is com- 
pulsory and done by the householder ; in the wage census 
the answers were voluntary and given once and for all by the 
employer; in the various inquiries undertaken by the Labour 
Department the answers are voluntary, but in many cases 
periodic, so as to become quasi-official. The method of collection
of import and export statistics is a blend of all these.
There are three classes of persons who know the
facts in question — the sender of the goods, the
custom-house official through whose hands they pass, and the
recipient or his agent. Circumstances decide that, in the case of
exports from the United Kingdom, the exporter or his agent
sends an account of the quantity and value of goods de- 
spatched to the Statistical Office of the Board of Trade ; that, in 
the case of imports, the receiving-agent hands over an account 
of goods to be landed to the custom-house officials, who verify 
the account, roughly if the goods are duty free, carefully if they 
are liable to duty ; and that, in the case of transhipment, the 
goods are treated in the same way as imports at the port of 
landing, and to some extent verified at the port of embarkation. 

The blank forms, being filled in by officials as part of 
their duty, or by agents thoroughly used to the task, need no 
covering letter, and may be made as complicated as necessary ; 
no questions are inserted but only blank tables. An examina- 
tion of the forms in use will show what are included as exports 
and imports in the Board of Trade totals, and what is the total 
amount of information available for tabulation. 

The quantities we wish to measure in this investigation are :
the volume or weight and value of all goods which have an
exchange value, which leave our shores or reach
them from without, subdivided as regards classes
of commodities and countries of destination or origin ; the values
being those at the times of loading or unloading. The quantities
we can measure are sharply distinct from these, being the
records of values and volumes which reach the Board of Trade.
We should therefore examine the forms to decide — (1) What
part of imports and exports are recorded ; (2) whether the values
are correctly given, (3) the quantities accurately registered,
(4) the commodities accurately defined, (5) the countries of
origin and destination accurately distinguished in the returns.
On reaching port the ship's master has to send in an
account, of which the following is an abridged
specimen : —

No. 1. Port of X.
[If Sailing Vessel or Steamer] STEAMER.
REPORT No. 980.*

Ship's Name : Marianne.
Official No., No. of Register, Date of Registry :
Tonnage : 700.
British or Foreign (if British, Port of Registry ; if Foreign, Country to which she belongs) : BRITISH.
Number of Crew : British Seamen, 12 ; Foreign Seamen, — ; Total ..
Name of Master, and whether a British or Foreign Subject : H. Hind.
Port or Place from whence arrived : Havre, France.

Cargo.

Column headings :

Name or Names of Places where laden in order of time.
Marks.
Nos.
Packages and Description of Goods. Particulars of Goods stowed loose, and Contents of each Package of Tobacco, Cigars, or Snuff intended to be imported at this Port.
Particulars of Packages and Goods (if any) for any other Port in the United Kingdom.
Particulars of Goods (if any) to be Transhipped or to remain on Board for Exportation.
Name of Consignee.

Entries (place where laden, Havre, France ; consignee, Smith) :

Paris to London.— 600 pkgs. Fruit and Perishables.
Marks COK, AE, KG, FOT, AC ; Nos. 1392, 495/6, 340/9, 1, 10.— 68 pkgs. Merchandise.
Marks KL, ACD ; Nos. 40, 20.— 70 cases Wine.
Marks WD, O&D ; Nos. 166, 1.— 5 cases Woollens in transit to Liverpool ; 1 case Brandy.

[If any wreck fallen in with or picked up, to be stated.]

Stores.

Surplus Stores remaining on board, viz. : Tobacco

Number of Alien Passengers (if any)
Pilot's Names
Nil.

At what Station Ship lying : South Quay.
Agent's Name and Address : C. J. C.

I declare that the above is a just report of my Ship and of her Lading, and that
the Particulars therein inserted are true to the best of my knowledge, and that I have
not broken Bulk or delivered any Goods out of my said ship since her departure from
Havre, the last Foreign Place of Loading.

(Signed) H. HIND, Master.
Signed and declared this 13th day of October 1890
In presence of
(Countersigned)
Collector.

* I.e., 980th ship at X. since 1st January.
The goods for quick transit are passed at once, and a special
form is sent to the Board of Trade similar in character to that
on p. 67. The remaining goods are treated either as dutiable or
as duty-free articles. In the list before us, ten cases of wine are
entered for home use, and an account is sent in to the Statistical
Office; sixty cases are warehoused and another account (as to
quality, quantity, and value) is sent in; the whole are registered
as imports. Twenty of the warehoused cases are removed to another
port and re-exported; an account is sent, and they are entered as
exports of foreign goods. Twenty are put on board ship as stores at
the original port, and twenty more removed to another port for the
same purpose, and of this the central office takes no account; the
remainder are removed to another warehouse, still in bond, and on
leaving that will be treated in one of the four ways just mentioned.
Other dutiable articles are treated in the same way.

[Examination of goods.] Goods not sufficiently described or not
answering to their description are opened, their contents entered on
a "bill of sight," and an account sent in. Private effects are
separately examined, being described on a "sufferance" form; if they
are bona-fide personal goods no record is kept of them, except in the
case of dutiable goods, which are treated as ordinary imports. If the
dutiable goods are concealed, either among private effects or
merchandise, and forfeited, they are not reckoned as imports.

Bullion is entered on a separate form and kept distinct
throughout the accounts.

[Free goods.] The duty-free goods, if for transhipment at another
port, are sent there under seal, and barely examined; they are
treated at the central office in the same way as dutiable transfer
goods. The remaining free goods, which in general form the bulk of
the cargo, are entered on such a form as follows, which is worth
notice, for it is a specimen of the rough material from which our
foreign trade figures are evaluated.
Port ..........            Dock or Station ..........
(This space is for the use of the Officers of Customs.)
Importer's Name ... (No. —)

Ship's Name.   Master's Name.   Rotation No.   [heading      Port or Place
                                               illegible]    whence ...
Marianne.      H. Hind.         980.           13/10/96.     Havre, France.

Examination.

Marks and    No. of Packages and Description        Quantity.         Value,
Nos.         of Goods.                                                £.

COK 1392     One   Goods Manuf. N.O.E.:
                     Billiard Cue Tips              ...               28
AF 495/6     Two   Leather Shoes                    10 doz. prs.      58
KG 340/9     Ten   Cotton Manuf.:
                     Trimmings                      ...               140
                     Embroideries                   ...               280
                     Piece Goods, not Muslins       300 yds.          8
FOT 1/10     Ten   Gloves of Leather                11,240 doz. pr.   12,316
 "  11/5     Five  Silk Broad Stuffs                ...               10,400
 "  16/20    Five  Works of Art:
                     Plaster Casts                  ...               3[?]
                     Statuary                       ...               1,280
                     Pictures by Hand               3                 10,200
 "  21/5     Five  Books Bound                      4 cwt.            300
 "  26/30    Five  Bronze Manuf. Ornaments          3 cwt.            38
 "  31/5     Five  Metal Manuf. Ornamental
                     Brass-headed Nails             4 cwt.            24
 "  36/40    Five  Silk Manuf. Dresses, Mantles,
                     Trimmings                      ...               1,816
 "  41/50    Ten   Goods Manuf. N.O.E.:
                     Fancy Goods                    ...               110
                     Horseless Carriage             ...               160
                     Brushes                        ...               78
                     Glue                           ...               110
                     Billiard Chalk                 ...               12
                     Hardware                       ...               116
[mark        Four  Stationery Ink                   ...               48
illegible]
             One   Iron and Steel Manuf.
                     Machinery, British, returned   3 cwt.            24

I enter the above goods as free of duty, and declare the above
particulars to be true.

Dated this 13th day of October 1896.

(Signed)     J. Jones,
             Importer or his Agent.


[Verification of data.] The information so received is usually
accepted at the central office without inquiry. It frequently
happens, however, that the form is not properly filled in by the
agent, the values often being omitted. When this is so, it is the
duty of the clerk at the port of entry to fill in the value, in
accordance with a list of current prices with which he is
provided. It may happen that he has to appraise the goods on
inspection, a process leading in some cases to great error, which
is enhanced when not even the quantities are given. When
there is a palpable error or omission in the form, or when the
price appears out of the common, a query is sent from the
central office to the port: e.g., with reference to such a form as
that just given, the following correspondence might arise: —

1. Pictures by hand, £10,200. Explain high value. Answer.
— Correct; invoice was seen; pictures by Millet.

2. Books bound: is weight or value incorrect? Answer. —
Both correct; advice seen; old and valuable books.

3. Goods entered as "goods manufactured, chip plaiting":
explain nature, and state if description is correct. Answer. —
Correct; wood shaving plaited and occasionally mingled with
horse-hair, &c.

4. Potatoes, 40 cwt., £62. Weight or value? Answer. —
Value correct. Weight should be 400 cwt.

Thus any unusual entries are liable to be checked and
verified.
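
The kind of check behind query 4 can be sketched in a few lines of Python. This is only an illustration: the price range per cwt., the `Entry` record, and the function `needs_query` are assumptions made here, not part of any official procedure. The idea is simply that an entry is queried when its value per unit of quantity falls outside the range the clerk's price list would lead one to expect.

```python
from dataclasses import dataclass

# Hypothetical price list: plausible range of value per cwt, in pounds.
# The bounds are assumptions chosen for illustration, not official prices.
PRICE_RANGE_PER_CWT = {"potatoes": (0.10, 0.30)}


@dataclass
class Entry:
    description: str
    cwt: float
    value_pounds: float


def needs_query(entry: Entry) -> bool:
    """Return True when the value per cwt falls outside the listed range."""
    low, high = PRICE_RANGE_PER_CWT[entry.description]
    unit_value = entry.value_pounds / entry.cwt
    return not (low <= unit_value <= high)


as_entered = Entry("potatoes", 40, 62)     # 62 / 40 = 1.55 pounds per cwt
as_corrected = Entry("potatoes", 400, 62)  # 62 / 400 = 0.155 pounds per cwt

print(needs_query(as_entered), needs_query(as_corrected))
```

Under the assumed range the entry as first made is flagged, while the corrected weight of 400 cwt. passes without query.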

[Possibility of errors.] In the case of goods not easily valued, or
of miscellaneous goods not easily tabulated, errors must arise in
this way; and another error may enter if a clerk, who does not
wish to receive too many queries from headquarters, enters at
ordinary rates goods of exceptional value;
but when staple commodities and large quantities are involved,
all the persons concerned will be familiar with the forms they
have to fill, the prices will be known, and so in important cases
errors will be at a minimum. The import total values, therefore,
are the sum of many quantities of various degrees of
accuracy, and it is not difficult when looking through the list of
items in the annual report to see which are specially liable to
error. Such commodities as old books, works of art, goods
where sale depends on the fluctuations of fashion, racehorses,
and so on, have values varying from day to day, and their
exact value in the balance of imports and exports cannot be
determined.

The quantities and values of exported goods are filled in by
the shipper or agent, and sent to the central office
within six days of the ship's clearing. The following
is an abridgment of the form used: —
The forms for British and Irish goods are distinct from those
for foreign, free and duty-paid, goods; and there are distinct
export forms for transhipments, which have already been registered
as imports. In these cases the specification and quantities
are likely to be correct, but there are causes which may falsify
the values. If they are to be subject to an ad valorem duty, they
may be undervalued; if they are adulterated goods, masquerading
as genuine, they may be over-valued. It seems hardly
possible to estimate these errors.

[Definition of official imports and exports.] We are now in a
position to define imports and exports according to their meaning
in the Board of Trade Returns; as, for instance, when for 1895 the
value of imports is stated as £416,000,000, and of exports as
£285,000,000, of which £60,000,000 are re-exports
of foreign or colonial goods.
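
The arithmetic implied by these three figures can be set out as a short Python sketch. The names `home_exports` and `retained_imports` are labels chosen here for illustration, and the calculation takes the quoted totals at face value, ignoring the questions of valuation discussed in the text.

```python
# Figures for 1895 as quoted in the text (pounds sterling).
imports = 416_000_000
exports = 285_000_000
re_exports = 60_000_000  # foreign or colonial goods, counted in both totals

# Exports of home produce: total exports less re-exports.
home_exports = exports - re_exports

# Imports not sent out again: total imports less re-exports.
retained_imports = imports - re_exports

# The excess of imports over exports is the same on either footing.
excess = imports - exports
assert excess == retained_imports - home_exports

print(home_exports, retained_imports, excess)
```

Because the re-exports enter both totals, deducting them from each leaves the excess of imports over exports unchanged.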

This total for imports includes all goods landed through the
custom-houses, including goods immediately shipped as stores,
or returned from customers unused. Goods immediately re-shipped
at the same or another port, or held in bond and then
re-shipped, are included both as imports and exports. Bullion is
not included, being given separately, nor cargo unlanded and
so reported, nor personal luggage or private effects, except when
duty is charged. The value reckoned is the nominal exchange
value when or just before they are landed; that is, their value is
already increased by freight, but not increased by duty.

The total for exports includes all goods entered on ships' bills of
lading, but does not include ships' stores or passengers' luggage, nor
cargo unlanded and so reported, nor bullion, which is given separately.
The value is reckoned at the time they are put on board.
Ships leaving our shores to be sold to foreigners are now included.

The treatment of coal throws light on this paragraph. Coal
taken for use on the voyage is registered, but not included
among exports; coal as cargo is included.

Among exports not registered are cash taken privately and
personal effects; among imports not registered are smuggled
goods, and cash and personal effects.

For the causes and extent of the resulting differences between
imports and exports, Sir R. Giffen's two papers * on the subject
should be consulted.

* Essays in Finance, Second Series; and Journal of the Royal Statistical
Society, 1899.
CHAPTER IV.

TABULATION.

Leaving now the consideration of blank forms of inquiry, let
us turn to the methods by which our data, accumulated on these
forms, can be tabulated. At first sight the tabulation of so many
million census forms, so many schedules of wages, and so many
lists of goods imported, seems mere office work, to be done
mechanically,* only requiring accuracy and not subject to
scientific analysis. Tabulation does, indeed, involve a great
deal of automatic labour; but the determination of the exact
form of the table and the choice of the headings to which the
totals shall correspond task the administrative statistician, and
are worth the closest study.

* An account of Mr Hollerith's electrical tabulating machine, used in the
XIth Census of the U.S.A., will be found in Dr Bertillon's Cours Élémentaire,
p. 579 seq.

[The function of tabulation.] The function of tabulation in the
general scheme of a statistical investigation is sufficiently
definite; it is to arrange in easily accessible form the answers to
those questions with which the investigation is concerned. If it
is required to know, for instance, the number of persons of each
sex and age-group in all the districts of the country, the figures
in the table must show these numbers. Or, to take a less definite
problem, we want all the information possible as to labour disputes.
In studying the forms issued by the Labour Department,
we have seen that the information which can be obtained is not
precisely that which we require. The problem then is so to
tabulate our information that our totals may give answers as
near to our requirements as possible, and it can easily be
found by experiment that the way to do this is by no means
obvious.

Not only must the figures be grouped so as to answer the
questions put forward in the original scheme, but if the information
is of wide and varied interest, as in all the investigations
so far considered, the data must be studied from many
points of view, and tabulated so that students in all branches of
knowledge may be able to extract from our tables the information
they require. Thus the population census is used by
the financier, the legislator, the merchant, and the commercial
traveller; political economists turn to it for light on the
development of industry, and on the change of numbers in
each trade; those interested in social questions will study
the ages and sex-distribution in various districts or occupations;
the sociologist and biologist will need accurate information
as to the growth of population and the change of age
distribution.

To take more specific points, the blue-book which con- 
tains the tabulation of foreign trade statistics will be ex- 
pected to show how our trade with each country is de- 
veloping, whether we are holding or improving position in 
certain markets ; whether we are exhausting our supply of raw 
materials ; whether some new commodity is yet of importance. 
It must be remembered that the original material is not 
accessible to the public, that they are dependent on the 
information extracted for them, and that, though it would be 
possible to turn through all the forms for special data, yet the 
labour needed would be prohibitive, while a little more detail 
in the tabulation might easily have isolated the information 
needed. 

[Three groups of tabulations.] For convenience, the methods of
tabulation may be divided into three groups: A. The simple
statement of totals of persons or things which satisfy given
conditions, such as the number living in a town, or the total value of
imports from France; B. The grouping of a great number
of units in relation to some particular property possessed
by all — e.g., the population according to ages, or wage-earners
according to the value of their wages; C. The tabulation
of non-numerical answers in suitable groups to give
a view of the whole — e.g., the causes of strikes or the state of
employment.

In the tabulation the convenience of the reader must be
studied. The table must be so arranged that any totals required
can instantly be found. This is to a great extent a question of
typography, the use of suitable founts for figures and headings,
and also of the choice of the right shape and size of page.
Supposing the best possible choice made in these respects, our
rule will then be to get the maximum amount of information into
the minimum space.



[Classes of tabulation.] Group A. — Thus we can have SINGLE
tabulation, answering one or more groups of independent
questions, as: —

        Number and Membership of Trade Unions.*

Year.            Number of Trade Unions    Total Membership of these
                 at end of Year.           Unions at End of Year.

[illegible]            1,317                     1,493,375
[illegible]            1,307                     1,611,384
[illegible]            1,267                     1,644,591

Double tabulation shows the subdivision of a total according
to two categories, in the following example according to sex and
age: —

Classification of Paupers in Ireland. — Total Numbers who
received Relief during the Year ended Lady Day 1892.†

Ages of Persons Relieved.       Males.      Females.        Total.

Under 16 years                  44,391       43,648         88,039
Of 16 and under 65 years       132,370    [illegible]      211,416
Of 65 years and upwards         35,121    [illegible]       80,789

All ages                       211,882      168,361        380,243

* Compiled from the Sixth Annual Abstract of Labour Statistics, p. 1.
† Ibid., p. 102.
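
How a double tabulation is built up from individual returns, with its cross totals, may be sketched as follows. The handful of records and the function name `double_tabulation` are invented for illustration; only Python's standard `collections.Counter` is used. The point of the sketch is the check at the end: the row totals and the column totals must each add up to the same grand total.

```python
from collections import Counter

# Hypothetical individual returns: (age group, sex). Invented for illustration.
returns = [
    ("Under 16", "Male"), ("Under 16", "Female"), ("Under 16", "Male"),
    ("16 and under 65", "Female"), ("16 and under 65", "Male"),
    ("65 and upwards", "Female"), ("65 and upwards", "Female"),
]


def double_tabulation(records):
    """Count each (row, column) cell, with row, column, and grand totals."""
    cells = Counter(records)
    row_totals = Counter(row for row, _ in records)
    col_totals = Counter(col for _, col in records)
    return cells, row_totals, col_totals, len(records)


cells, by_age, by_sex, grand_total = double_tabulation(returns)

# Cross-check: both sets of marginal totals must add up to the grand total.
assert sum(by_age.values()) == sum(by_sex.values()) == grand_total

for age in by_age:
    print(age, cells[(age, "Male")], cells[(age, "Female")], by_age[age])
print("All ages", by_sex["Male"], by_sex["Female"], grand_total)
```

The same cross-check is a convenient test of any printed double table: a discrepancy between the two sets of totals points to a misprint in one of the entries.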
More information may be included thus: —

Classification of Paupers in England and Wales. — Total
Numbers who received Relief during the Year ended Lady Day
1892.*

Ages of Persons          Indoor.    Outdoor.      Total.     Metropolis.  Other Parts
Relieved.                                                                 of England
                                                                          and Wales.

Under 16 years           111,782     441,805      553,587     100,671       452,916
Of 16 and under
  65 years               232,284     385,299      617,583     148,066       469,517
Of 65 years and
  upwards                114,144     287,760      401,904      64,779       337,125

All ages                 458,210   1,114,864    1,573,074     313,516     1,259,558

A TREBLE tabulation can be used, subdividing the total into
three distinct categories, with cross totals for each group. Thus
the following table gives separate divisions according to age,
sex, and district; percentage lines, in a distinct type, are also
introduced: —

* Ibid., p. 101.
The same process can be further extended: the example
in the table opposite shows an arrangement for a QUADRUPLE
tabulation, distribution by district, date, sex, and occupation,
with subsidiary information; but it is generally better to use
two or more tables than to increase the complication, unless
it is necessary to bring several categories into close relation.
Suitable varieties of type will often make comparisons easy in
a very complex table.

[Tabulation of census material.] Looking now at the census
householders' schedule (p. 23), it will be seen that there are about
twelve different items of information about each person: county,
town, parish, position in family, civil condition, sex, age,
occupation, industrial position, infirmity, birthplace, and
house-room. These could be tabulated in 66 different single, 220
double, or 495 treble tabulations, so that there is plenty of
scope for choice.
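
The three counts quoted coincide with the numbers of ways of choosing two, three, and four items out of twelve. Whether the author reckoned them in exactly this way is an assumption; the short Python check below, using the standard `math.comb`, merely shows that the figures agree with these binomial coefficients.

```python
from math import comb

ITEMS = 12  # items of information per person on the schedule

# Number of ways of choosing k of the twelve items, for k = 1 to 4.
ways = {k: comb(ITEMS, k) for k in range(1, 5)}
print(ways)  # {1: 12, 2: 66, 3: 220, 4: 495}
```

On this reading a "single" tabulation pairs two items of the schedule, a "double" one combines three, and a "treble" one four.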

[Mr Booth's tabulation.] To fix our ideas, we will take occupation
as the main subdivision, and examine Mr Booth's use of the census
returns, say for London Printers.*
First he gives a treble classification — occupation, sex, and
age — using columns 4, 5, and 6 of the schedule.



Census Divisions, 1891.     Females.               Males.               Total.
                            All Ages.     -19.     20-54.      55-.

1. Printer                  [illegible]   9,988    21,784     1,921     35,009
2. Lithographer, &c.            809         757     3,037       437      5,040

Total                         2,126      10,745    24,821     2,358     40,049

Then follows a single table, district and numbers, using the
information on the back of the schedule.

                          Distribution.

   E.          N.          W. &c.        S.          TOTAL.

  5,884       9,835        7,577       16,753        40,049

* Life and Labour of the People, vol. vi., p. 189.
[Table not legibly reproduced: only the column heading "Men." and
scattered figures survive.]
Three simple tables are then given, relating to heads of
families, using columns 2 and 4 (sex), 2 and 10 (birthplace),
and 2, 7, 8, and 9 (industrial status).

His next table uses columns 2 and 6, and is as follows : — 



                    Total Population Concerned.

                      Heads of    Others                                 
                      Families.   Occupied.   Unoccupied.   Servants.   Total.

Total                  18,048      16,060       47,257         854      82,219

Average in Family         1          .89          2.62         .05       4.56
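
The line of averages can be recomputed from the totals, taking the number of families to be the number of heads of families. The following Python sketch does so; the variable names are chosen here for illustration, and the figures are those of the totals line (18,048; 16,060; 47,257; 854).

```python
# Totals as printed in the table.
totals = {
    "Heads of Families": 18_048,
    "Others Occupied": 16_060,
    "Unoccupied": 47_257,
    "Servants": 854,
}
families = totals["Heads of Families"]  # one head per family

population = sum(totals.values())
averages = {name: round(count / families, 2) for name, count in totals.items()}
average_family = round(population / families, 2)

print(population, averages, average_family)
```

The four classes add up to the printed total of 82,219, and the quotients reproduce the printed averages, the last being 4.56 persons to a family.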



The next table (not here given) is a single classification 
according to number of rooms and servants, a most ingenious 
indirect use of the scheduled information ; and the last is an 
example of the legitimate use of a quadruple tabulation — 
occupation, industrial status, sex, and age — given on the next 
page. 
[The census tabulations.] It would be difficult to find a better
example of tabulation of a great multitude of details to serve a
special purpose. The census authorities had in many cases not
tabulated the necessary details, and it was necessary to turn
through the original schedules to get at the facts. For such
work as this, the function of tabulation is simply to provide
the answers to definite questions. Thus the census reports
show how many persons of each sex and age-group belong
to certain industries in certain places, in a quadruple tabulation
extending over many pages, each page relating to one district,
and this table may be used for accomplishing many separate
purposes: each item is already a total ready for use. It is
impracticable from limits of time and space, even if it were
desirable, to tabulate all the possible groups of qualities which
can be made from the twelve statements on each census form;
a good tabulation will aim at providing only those statements
which are of practical use. Thus many simply descriptive totals
are given, such as the numbers of each sex and age in each parish
in the United Kingdom, to serve primarily for administrative
purposes; and many statements which will afford the economist
and sociologist the opportunity of tracing the progress of
industries, of studying the ages of workpeople in different
occupations, the changes in age-grouping of the nation; and some
further tables might be given to throw light on problems of
cause and effect, such as the average ages in town and country,
the connection between infirmities and occupation, or the ages
of marriage in various districts or industries.

[Minutiae.] It is interesting to open one of these great tables of
figures, such as are generally to be found forming the bulk of a
blue-book, and taking a figure at random, ask "Why
is this figure printed, what question does it answer,
to whom can it give information?" For instance, in the Eighth
Report on Trade Unions, p. 257, we find that the United
Brickworkers' and Brick Wharf Labourers' Union spent £20
on funeral expenses in 1894, an average of 3s. 7½d. per member.
As an isolated statement this may interest a very small number
of persons; but that small number has a right to expect
that they shall find the figures relating to their union tabulated
in a general official book; to them it may be as important as
the item, on the same page, of £[amount illegible] spent by the
Boilermakers. From this point of view, the question of inclusion of
such small items is simply one of space. If space is limited,
a selection would be made of larger quantities only, as being
likely to concern more people.

[Importance of raw material.] But there is a reason of quite another
character for printing such items as these. The raw material, on
which the totals in such tables are based, is not accessible to the
student except by means of this Report. Now, the
compiler of these statistics cannot know from what particular
point of view they will be studied. It may be desired to
examine and group trade unions according to their expenditure
on different items, to study their history, classifying them
as fighting organisms and as friendly societies. The tabulations
needed cannot well be foretold. The material is therefore
given in the rough, in order that the tabulation may be
made by each student according to his needs. At the same
time the most suggestive totals are given as one of these
possible methods of tabulation; and in the summary of such
a report, the items are retabulated, the rough material being
omitted, in those ways which the editor thinks most useful.

[Selection of raw material.] When space is much too limited for any
publication in extenso of the items, a careful selection must be
made of those to be printed; and it is this selection that is
generally open to most criticism. Owing to the great admiration for
uniformity generally to be found in the official mind, valuable
space is wasted on such statements as * —

COVENTRY: 1891.                                         MALES.

Shipwright: Ship, Barge, &c., Builder (Wood)
    "         "     "     "      "    (Iron)

while all the males — masters, traders, skilled workmen, labourers,
errand-boys — engaged in the cycle trade in Coventry are included
in —

Bicycle, Tricycle — Maker, Dealer                       3,854

* Census Report.

[Economy of space.] In such cases, two useful rules might be
applied: omit all numbers under, say, 500 when by so doing a line of
print would be saved; and give all numbers over 10,000 correctly only
to the nearest 100, and so for other digits in proportion, thereby
reducing the width of columns of print. If, for example, we
knew to the nearest 100 the exact numbers in each district
and occupation in which as many as 1,000 were
employed, our knowledge would be as complete
as we needed; and it is doubtful whether the space
occupied by this tabulation would be more than that already
devoted to the subject. In many cases, on the other hand, it
is essential to have the raw material quite unchanged. Each
tabulation must be judged on its own merits.
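
The two rules can be sketched as a small Python function. Two points are assumptions made here: the phrase "and so for other digits in proportion" is read as keeping three significant figures for every number above 10,000, and the condition that a line of print must actually be saved is ignored, so that every entry under 500 is dropped. The function name `abridge` is likewise only illustrative.

```python
def abridge(n, omit_below=500, round_above=10_000):
    """Apply the two space-saving rules to one table entry.

    Returns None when the entry would be omitted. Entries above
    `round_above` keep three significant figures (an assumed reading of
    "and so for other digits in proportion").
    """
    if n < omit_below:
        return None
    if n <= round_above:
        return n
    unit = 10 ** (len(str(n)) - 3)          # 100 for five-digit numbers
    return (n + unit // 2) // unit * unit   # round half up, integers only


for figure in (12, 3_854, 40_049, 82_219, 1_573_074):
    print(figure, abridge(figure))
```

Applied to figures met with in this chapter, 3,854 is kept unchanged, 40,049 becomes 40,000, and 1,573,074 becomes 1,570,000, while an entry of 12 disappears.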
[Tabulation of the Poor Law Returns, 1833.] It may be useful to take
a particular group of answers, and discuss what tabulations will
throw most light on the questions at issue. The Poor Law
Commissioners of 1833 collected information from a thousand
villages in England and Wales on the following six points
among others: the wages of an agricultural labourer in summer
and in winter, both with and without the inclusion of beer as
part payment, his annual earnings, and the subsidiary earnings
of his wife and children. It may be supposed that the chief
object of the Commissioners was to find whether the labourers'
families earned enough for their support, and what proportion
was earned by the wives and children.

The following scheme of tabulation would show in what
counties the labourer was badly off: —

                        Average Annual Earnings of
County.
                     Man.        Family.       Together.



The counties might be taken in alphabetical order for convenience
of reference, or in geographical order with subordinate
averages for groups (e.g., Eastern: Norfolk, Suffolk, Essex); or
the counties might be arranged in the order of the total earnings,
so that it could be seen at a glance in which counties the
labourers were worst off.

To show the number of villages, county by county, in which
the earnings were below a certain minimum, or within certain
limits, the following table might be used: —
              Annual Earnings of Men and Families.

                       Number of Villages in which the      Average Earnings in
                       Total Earnings averaged              County of
                       (class limits illegible)
                       (1)  (2)  (3)  (4)  (5)  (6)  (7)    Man.      Family.   Together.

In Norfolk              0    1    3    6    4    3    2     £30       £11       £41
 Percentages of Total
 Number of Villages     0    5   16  31½   21   16  10½     73        27        ...

In Suffolk              0    3    4    5    3    2    2     £28       £11       £39
 Percentages of Total
 Number of Villages     0   16   21   26   16  10½  10½     72        28        ...

In Essex                1    3    6    7   10    3    1     £28       £10       £38
 Percentages of Total
 Number of Villages     3   10   19   23   32   10    3     74        26        ...

In Eastern Counties     1    7   13   18   17    8    5     £28 10s.  £10 10s.  £39
 Percentages of Total
 Number of Villages     1   10   19   26   25  11½    7     73        27        ...


This table can be used in the above complex form or simplified.
The number of subdivisions of money to be distinguished
depends on the space at disposal and on the number of villages
which would be entered in each. A table in which most of
the entries are 1 or 0 is open to criticism. In the above table
the villages are too few to allow accuracy in percentage.
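
The percentage lines, and the row for the group of counties, follow mechanically from the village counts. The Python sketch below recomputes them from the counts as read from the table (seven earnings classes for each county); the function name `percentages` and the rounding to one decimal place are choices made here for illustration.

```python
# Villages per earnings class, read from the table above (seven classes).
counts = {
    "Norfolk": [0, 1, 3, 6, 4, 3, 2],
    "Suffolk": [0, 3, 4, 5, 3, 2, 2],
    "Essex":   [1, 3, 6, 7, 10, 3, 1],
}

# Group row: add the county rows class by class.
eastern = [sum(column) for column in zip(*counts.values())]


def percentages(row):
    """Each class as a percentage of the row total, to one decimal place."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]


for name, row in {**counts, "Eastern Counties": eastern}.items():
    print(name, sum(row), percentages(row))
```

With only 19 villages in a county, a single village shifts a percentage by more than five points, which illustrates why percentages on so small a base carry little accuracy.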

[Tabulation to show correlation.] It will be seen that this table
would furnish the answer to almost all questions which could be put
as to total earnings. For instance, if we wish to see the relation
between total earnings and the family's subsidiary contribution,
we should look at the smallest totals in the last
column and see if they corresponded with the largest percentage
of family earnings. If we found signs of correspondence we
should re-arrange the counties in the order of these subsidiary
percentages, and see if they were approximately in order of
total earnings also. This is an example of tabulation to show
correlation, the correspondence in the occurrence of two sets of
phenomena.
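
The re-arrangement described can be sketched in Python for the three Eastern counties, using the county averages as read from the table (man, family, together, in pounds). Three counties are far too few to establish anything; the sketch is meant only to show the procedure of ordering the counties in two ways and comparing the orders.

```python
# County averages read from the table (pounds): (man, family, together).
earnings = {
    "Norfolk": (30, 11, 41),
    "Suffolk": (28, 11, 39),
    "Essex":   (28, 10, 38),
}

# Family contribution as a percentage of total earnings.
family_share = {c: 100 * fam / tot for c, (_, fam, tot) in earnings.items()}

# Counties from smallest total earnings upwards.
by_total = sorted(earnings, key=lambda c: earnings[c][2])
# Counties from largest family share downwards.
by_share = sorted(earnings, key=lambda c: family_share[c], reverse=True)

print(by_total)
print(by_share)
print("orders agree:", by_total == by_share)
```

Here the two orders do not agree, so that these three counties, taken alone, give no sign of the correspondence in question.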

[Wages and earnings.] Another important group of questions arising
in connection with these tables is: What is the relation between
weekly wages and annual earnings, and what proportion of the
wage is generally paid in kind? We shall not
now require the statements as to subsidiary family earnings. In
records of agricultural wages the most common statement is, e.g.,
"wages in this district are from 10s. to 12s. a week." Now, a
farm labourer does not generally earn as much in winter as in
summer, because wages are reduced to correspond to the smaller
amount of work necessitated by failing light; from this cause
annual earnings will be less than the weekly wage multiplied by 52.
Besides this wage he generally receives special money at hay and
wheat harvests, and also many payments in kind, such as daily beer,
house and ground at reduced rent, and other privileges. It is
generally best to value all these, and compute his earnings thus: —

                                        £    s.
10s. for 38 weeks                      19    0
12s. for 9 weeks (summer)               5    8
Hay harvest, 1 week                    [?]  15
Wheat harvest, 4 weeks                 [?]   0
Beer, 1s. per week                      2   12
Cottage and ground                     [?]   0
Other perquisites                      [?]   5

                                      £39 0 0 = 15s. per week.
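
The reckoning can be checked in shillings. In the Python sketch below the two wage lines and the beer allowance are worked out from the rates stated; the remaining items (harvest money, cottage and ground, other perquisites) are not entered singly but taken together as the balance of the £39 total, which is an assumption of this sketch.

```python
SHILLINGS_PER_POUND = 20
WEEKS_PER_YEAR = 52

winter_wages = 10 * 38           # 380s. = 19 pounds
summer_wages = 12 * 9            # 108s. = 5 pounds 8s.
beer = 1 * WEEKS_PER_YEAR        # 52s.  = 2 pounds 12s.

total = 39 * SHILLINGS_PER_POUND  # 780s., the total of the account

# Harvest money, cottage and ground, and other perquisites together.
other_items = total - (winter_wages + summer_wages + beer)

per_week = total / WEEKS_PER_YEAR                 # equivalent weekly earnings
excess_over_wage = 100 * (per_week - 10) / 10     # per cent. above a 10s. wage

print(other_items, per_week, excess_over_wage)
```

The balance comes to 240s., or £12, and the total is equivalent to 15s. a week, which is one half more than the ordinary weekly wage of 10s.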

In this case earnings are 50 per cent. above the general
weekly wage. An estimate of this nature has been made by the
late Mr Little for each county for 1867-70 and 1892. We can
tabulate the figures for 1833 in the same way for comparison,
in geographical order and with the county as unit. We must
first consider the question, Has beer been at all
generally replaced by money? We can tabulate
the figures as follows to answer this: —

                         1833.                                    1892.

  1.       2.          3.          4.          5.          6.        7.           8.
County.  Average     Average     Difference. Number of   Propor-   Difference   Excess of
         Summer      Earnings                Villages    tion to   between      4 over 7.
         and Winter  per Week.               where Beer  Total.    Wage and
         Weekly                              is given.             Earnings.
         Wages.

In column 2 should be given the county average of the
wages stated, without making any cash allowance when beer is 
given. Then if money has been replacing beer, we should find 
that in those counties where beer was most often given, wages 
had risen relatively to earnings more rapidly than in the 
counties where free beer was rare. Columns 4 and 7 show the 
differences for the two dates. When the entry in column 4 is 
greater than the corresponding entry in column 7, kind has 
been replaced by money. These excesses would be given in 
column 8. If money has replaced beer, the counties which 
have the greatest entries in column 6 should also figure high in 
column 8, and vice versa. 
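
The relations between the columns may be sketched in Python. The two counties and all their figures below are hypothetical, invented purely to show how columns 4, 6, and 8 are derived and how the test of the last sentence is applied; the class name `CountyRow` is likewise an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class CountyRow:
    county: str
    wage_1833: float          # column 2, shillings per week
    earnings_1833: float      # column 3, shillings per week
    villages_with_beer: int   # column 5
    villages_total: int
    difference_1892: float    # column 7, shillings per week

    @property
    def difference_1833(self):   # column 4
        return self.earnings_1833 - self.wage_1833

    @property
    def beer_proportion(self):   # column 6
        return self.villages_with_beer / self.villages_total

    @property
    def excess(self):            # column 8 = column 4 - column 7
        return self.difference_1833 - self.difference_1892


# Hypothetical counties and figures, invented purely for illustration.
rows = [
    CountyRow("County A", 10.0, 15.0, 18, 20, 3.0),
    CountyRow("County B", 11.0, 14.0, 4, 20, 2.5),
]

by_beer = sorted(rows, key=lambda r: r.beer_proportion, reverse=True)
by_excess = sorted(rows, key=lambda r: r.excess, reverse=True)
print([r.county for r in by_beer], [r.county for r in by_excess])
```

If money has replaced beer, the two orderings should run roughly together, as they do for these invented figures.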

[Winter and summer wages.] The question, Are winter wages generally
below summer wages, and by how much? can be answered by the
following scheme of tabulation, which uses the data
not employed in the previous tables: —

                     Average Weekly     Number of Villages where the Excess of
                     Wage in            Summer Wages over Winter was

                     Summer.  Winter.   Nothing.  6d.   1s.   1s. 6d.  2s.   More
                                                                             than 2s.
                     s. d.    s. d.

Norfolk              11  2    10  3       13       2     3      2       5      3
 Percentage of Number
 of Villages included                     46       7    11      7      18     11

Suffolk              10  2     9  8       24      ...    6      1       2      1
 Percentage of Number
 of Villages included                     70      ...   18      3       6      3

Essex                10  9     9 10       22      ...   11     ...      5      4
 Percentage of Number
 of Villages included                     52      ...   26     ...     12     10

Eastern Counties     10  6     9 11       59       2    20      3      12      8
 Percentage of Number
 of Villages included                     57       2    19      3      12      8


These examples do not quite exhaust the useful tabulations
of these groups of figures, for we have not yet examined the
distribution of wages, that is the relative numbers paid at
different rates. These returns do not, however, illustrate such a
tabulation well, for we are not told the rates paid to individuals,
but only the rate prevalent in the villages.

Group B. — The grouping according to wages affords an 
example of the second method of tabulation. We have now 
no definite questions to answer, as in the method so far discussed, 
but a more general problem : given a mass of data, it is re- 
quired to tabulate it, so as to present the maximum amount of 
useful information. Our raw material is so many thousand 
isolated statements, which must be focussed, made to present 
definite meaning, and worked up so as to be useful for future 
comparison. 

Some investigations are undertaken not to answer any definite
questions or to throw light on any given problem, but to
collect information which, though it has no immediate use, is
likely to be needed ultimately by many investigators occupied
with various questions. Such is a wage census. So long as we
have no sufficient account of wages, we are badly informed as to
one of the most important measurements of the social body, and
economists and statisticians are continually hindered by the want
of data essential for their work ; but the census has no immediate
practical use, for knowing
the height of wages does not help us directly to regulate that 
height. In such an investigation our object will be to examine 
the figures, and give all the groupings and averages which seem 
likely to be useful for any purpose ; and while doing this we 
shall imperceptibly pass to a different class of investigation ; 
we shall be finding a structure underlying our multifarious 
details ; we shall find that the chaos, which our figures present 
at first sight, obeys laws ; we shall be making a visible outline, 
and giving a definite shape to our apparently featureless mass. 

The complete discussion of this problem belongs to a later 
chapter ; but the tabulation can be begun without special 
technique. The examples taken will relate chiefly to wages, 
but the methods are quite general. 

In the American Report on Wholesale Prices, Wages and
Transportation of 1891, the wages of some 10,000 persons are
detailed. It is proposed to consider their tabulation as a homogeneous
group. The results are given on pp. 91-2. In the original
publication the wages are given to half a cent ; in the second
column, on p. 91, the numbers of wage-earners are given in 10-cent
groups, from $.25 to $.34, $.35 to $.44, and so on, those earning
wages exactly at the dividing points being always placed in the
division below. Notice that the average wage of such a group as
$2.15 to $2.24 is not $2.20 if the wage-earners are evenly
distributed cent by cent, but the average of $2.15, $2.16, . . .
$2.24, i.e., $2.195.
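
The arithmetic of the last sentence may be checked by a short sketch in Python. The function name is merely illustrative, and the group is assumed, as in the text, to hold one wage-earner at every whole cent from its lowest to its highest wage.

```python
from fractions import Fraction

def even_group_average(lowest_cents: int, highest_cents: int) -> Fraction:
    """Average wage, in dollars, of a group spread evenly cent by cent."""
    cents = range(lowest_cents, highest_cents + 1)
    return Fraction(sum(cents), len(cents)) / 100

# The group $2.15 to $2.24 contains the ten wages $2.15, $2.16, ... $2.24.
average = even_group_average(215, 224)
print(float(average))  # 2.195
```

The average falls half a cent short of $2.20 because the group stops at $2.24 and does not include $2.25.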

Looking at column 2, it will be seen that the figures present 
no order, follow no rule ; no structure has yet been found, our 
divisions are too narrow for our material. 

Now group the wage-earners with wider limits, as in column 6, 
where the numbers earning in half-dollar groups are given ; we 
have here a nearly regular sequence of numbers falling after the 
maximum in the second group. Going back to narrower limits, 
to find exactly at what divisions this regularity is first in evidence, 
we have in column 4 the numbers in 20-cent groups which show 
considerable, but not absolute regularity. The numbers in 
30-cent groups* are successively 75, 355, 674, 1,242, 740, 660, 
343, 310, 180, 181, 233, 32, 82, 3, 4, 8, 1, almost completely
regular except for the large group at $3.50. 

The question as to which of these groupings should be selected
is to be decided by the number of separate items the eye can
instantaneously grasp. In looking at the 25 numbers in the
20-cent groups, or the 18 in the 30-cent, the meaning is lost in
a maze of figures (though as many details as these could be
properly shown in a diagram), but the 11 numbers in the
half-dollar groups are easily comprehended.

Stated in words, the result of our tabulation (column 7) is
that 6 per cent. of the wage-earners made from $.25 to $.74,
28 per cent. from $.75 to $1.24, and so on.

For the practical work of the tabulation from the original
figures, we should take ruled sheets, enter at the head of successive
columns certain wage limits, and turning through the items enter
each wage by a dot in its appropriate column, grouping them in
fives and tens, to facilitate addition.
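
The dotting of ruled sheets may be imitated by a counter, as in the following sketch. The function name, the column width and the list of wages are all assumptions made for illustration ; the wages are invented, are taken in whole cents, and are not from the Report.

```python
from collections import Counter

def tally(wages_in_cents, lowest=25, width=50):
    """Count wages into columns `width` cents wide, the first starting at `lowest`."""
    marks = Counter()
    for wage in wages_in_cents:
        column = (wage - lowest) // width      # 0 for $.25 to $.74, 1 for $.75 to $1.24, ...
        marks[lowest + column * width] += 1    # the key is the lower limit of the column
    return dict(sorted(marks.items()))

# Hypothetical items, entered one by one as dots would be on the ruled sheet.
items = [60, 75, 110, 124, 125, 150, 150, 200, 325]
print(tally(items))  # {25: 1, 75: 3, 125: 3, 175: 1, 325: 1}
```

Narrower columns are obtained by changing `width`, which is the choice discussed in the paragraphs that follow.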

From the preceding paragraphs it is clear that we do not 
need to take separate columns for each cent from $.25 to $5.35 
for tabulation, but a little consideration is necessary to see how 
minute the limits should be to give the correct average. 

* Vide p. 121 infra.

Suppose the entries in cent groups to be:

        $1.70        $1.71        $1.72        $1.73        $1.74
        [dots tallying the entries under each wage, 46 in all]

The average of the wages so entered can be quickly calculated
as $1.718.

If, on the other hand, we put all the 46 entries as simply between
$1.70 and $1.74, or more exactly as much as $1.70 but less
than $1.75, we should naturally take them to be all (for purposes 
of averaging) at the middle point of this group, viz., $1.72. 

If we have a sufficient number of items, the differences
between the average assumed and that calculated for each group
will be very slight. This is seen on p. 91 ; column 8 gives the
averages calculated from the entries in 10-cent groups, while
column 9 gives them on the hypothesis that for purposes of
averaging the numbers in the half-dollar groups are all at the
middle points of their groups. The difference is greatest in the
first and last, the smallest groups. The general average obtained
from column 9 is $1.70, which is the nearest round number to the
true average $1.73. Hence, for the purpose of obtaining the
general grouping and average, we need only take 11 half-dollar
columns for marking in our items.
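
The general average on the hypothesis of column 9 may be recomputed as in the sketch below. The numbers of persons are those of column 6 on p. 91 ; the second and third are not legible in this copy and are here inferred from the percentages of column 7 and the total of 5,123, so that they should be treated as an assumption. The function name is illustrative only.

```python
# Half-dollar groups: (middle point assumed for the group in $, number of persons).
# 1472 and 1297 are inferred (28.7 and 25.3 per cent. of 5,123), not read directly.
groups = [
    (0.50, 317), (1.00, 1472), (1.50, 1297), (2.00, 970), (2.50, 506),
    (3.00, 198), (3.50, 254), (4.00, 96), (4.50, 4), (5.00, 8), (5.25, 1),
]

def midpoint_average(groups):
    """General average when every person is placed at the middle point of his group."""
    total_persons = sum(number for _, number in groups)
    total_wages = sum(middle * number for middle, number in groups)
    return total_wages / total_persons

total = sum(number for _, number in groups)
print(total)                               # 5123
print(round(midpoint_average(groups), 2))  # 1.7
```

The result agrees with the $1.70 of column 9, three cents below the average of $1.73 obtained from the finer grouping.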

For other purposes it may be advisable to work more minutely ; 
for in the lowest group, we shall wish to know how many are 
earning $.25, $.30, $.35 separately, for 5 cents is a perceptible 
difference on 25 cents. At the top also it may be useful to know 
the exact wages. 

More minute entries again will be needed for the second
method of tabulation, which is as follows: Suppose all the
wage-earners to be arranged in order of the magnitude of their
wages, those at $.25 at one end, those at $5.35 at the other. Note
the wages of men at given points in the row. The lowest wage is
$.25 ; one-tenth of the way along, that of the 512th worker is
between $.85 and $.95 ; half-way up the wage is $1.50. The
figures at each tenth are given on p. 92. By this means we get a
very vivid idea of the distribution according to wages.

These numbers cannot be obtained accurately if we have
entered the details correct to half-dollars, but can be from
the 10-cent grouping, which is therefore the classification to
be adopted. We must first determine in which of the
groups the men one-tenth, two-tenths ... up the group lie,
and then estimate their position inside the smaller group.
Thus, if we want the figure more accurately than "between
$.85 and $.95," as given above, we proceed as follows: The 512th
man from the bottom is the 82nd man in the group between
$.85 and $.95, for there are 430 earning less than $.85 ; this group
contains 169 ; if they were distributed regularly, 17 to each
cent, the 82nd man would be half-way through this group,
between $.89 and $.90. The hypothesis of even distribution is
sufficiently correct for most purposes, and this method affords
a sufficiently accurate means of determining the wage of the
workers at the tenth places. The resulting figures are given on
p. 92. If, however, we want to know the wage of the half-way
man more exactly, we see from the half-dollar groups that it is
between $1.25 and $1.75, a rough approximation shows it to lie
probably between $1.45 and $1.55, and then we rapidly turn
through our original data, isolating the wages at $1.46, $1.47,
. . . $1.55.*
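
The estimate for the 512th man may be written out as follows. This is only a sketch of the hypothesis of even distribution described above ; the function and parameter names are not the author's.

```python
def wage_at_rank(rank, below, in_group, lower_limit, width=0.10):
    """Estimate the wage of the `rank`-th worker counted from the bottom.

    below       -- number earning less than the lower limit of the group
    in_group    -- number of workers in the group which contains him
    lower_limit -- lower wage limit of that group, in dollars
    Assumes the workers are spread evenly through the group.
    """
    position = rank - below                    # his place inside the group
    return lower_limit + width * position / in_group

# 512th man: 430 earn less than $.85, and 169 earn from $.85 to $.95.
estimate = wage_at_rank(rank=512, below=430, in_group=169, lower_limit=0.85)
print(round(estimate, 3))  # 0.899
```

The figure lies between $.89 and $.90, as stated in the text.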

A slight modification of this method is also useful. Take the 
average of the lowest 512 (or tenth), namely, $.70½ ; of the next,
namely, $1.03 ; and so on (see p. 92). These figures also give a 
vivid view, and are very convenient for comparisons with other 
groups. 

The figures so far apply to only half of the data in the 
Senate Report. On p. 92 the whole are tabulated to give the 
average wages of the successive tenths. A comparison of the 
two groups so obtained shows how far the first half was typical 
of the whole. This method will be dealt with in a later chapter.

* On this method see pages 127, 128.

Tabulation of Wages. American Figures, 1891.

Columns 1 and 2. 10-cent groups.

   1. Earning Daily Wages           2. No. of
   as much as    and less than      Persons.
        $              $
       .25            .35                1
       .35            .45               15
       .45            .55               59
       .55            .65               85
       .65            .75              157
       .75            .85              113
       .85            .95              169
       .95           1.05              201
      1.05           1.15              304
      1.15           1.25              685
      1.25           1.35              485
      1.35           1.45               72
      1.45           1.55          [illegible]
      1.55           1.65          [illegible]
      1.65           1.75              202
      1.75           1.85              329
      1.85           1.95               58
      1.95           2.05              273
      2.05           2.15          [illegible]
      2.15           2.25          [illegible]
      2.25           2.35               33
      2.35           2.45              101
      2.45           2.55              196
      2.55           2.65               13
      2.65           2.75              163
      2.75           2.85                2
      2.85           2.95               15
      2.95           3.05              129
      3.05           3.15                5
      3.15           3.25               47
      3.25           3.35               12
      3.35           3.45                0
      3.45           3.55              221
      3.55           3.65                5
      3.65           3.75               16
      3.75           3.85               11
      3.85           3.95                0
      3.95           4.05               82
      4.05           4.15                0
      4.15           4.25                3
      4.25           4.35                0
      4.35           4.45                0
      4.45           4.55          [illegible]
      4.55           4.65          [illegible]
      4.65           4.75                0
      4.75           4.85                0
      4.85           4.95          [illegible]
      4.95           5.05          [illegible]
      5.05           5.15                0
      5.15           5.25                0
      5.25           5.35                1
      Totals                         5,123

Columns 3 and 4. 20-cent groups.

   3. Earning                       4. No. of
   as much as    and less than      Persons.
        $              $
       .25            .45               16
       .45            .65              144
       .65            .85              270
       .85           1.05              370
      1.05           1.25              989
      1.25           1.45              557
      1.45           1.65              538
      1.65           1.85              531
      1.85           2.05              331
      2.05           2.25              310
      2.25           2.45              134
      2.45           2.65              209
      2.65           2.85              165
      2.85           3.05              144
      3.05           3.25               52
      3.25           3.45               12
      3.45           3.65              226
      3.65           3.85               27
      3.85           4.05               82
      4.05           4.25                3
      4.25           4.45                0
      4.45           4.65                4
      4.65           4.85                0
      4.85           5.05                8
      5.05           5.25                0
      5.25           5.35                1
      Totals                         5,123

Columns 5 to 9. Half-dollar groups.

   5. Earning                       6. No. of   7. Percent-   8. and 9. Average Wage
   as much as    and less than      Persons.       age.            in Group.
        $              $                                        $   instead of   $
       .25            .75              317          6.2         .62             .50
       .75           1.25            1,472         28.7        1.09            1.00
      1.25           1.75            1,297         25.3        1.49            1.50
      1.75           2.25              970         18.9        1.99            2.00
      2.25           2.75              506          9.9        2.53            2.50
      2.75           3.25              198          3.9        3.04            3.00
      3.25           3.75              254          5.0        3.51            3.50
      3.75           4.25               96          1.9        4.00            4.00
      4.25           4.75                4          ...        4.50            4.50
      4.75           5.25                8           .2        5.00            5.00
      5.25           5.35                1          ...        5.35            5.25
      Totals                         5,123          100
      Average Wage                                            $1.73           $1.70

Wages of "Tenth" Men (deciles).

      Lowest Wage          -      $.30
      1/10th up Group      -   [illegible]
      2/10ths    „         -      1.12
      3/10ths    „         -      1.22
      4/10ths    „         -      1.39
      5/10ths    „         -      1.49
      6/10ths    „         -      1.75
      7/10ths    „         -      1.99
      8/10ths    „         -      2.36
      9/10ths    „         -      2.98
      Highest              -      5.35

Average Wage of

      Lowest tenth         -      $.70
      Second   „           -      1.03
      Third    „           -      1.18
      Fourth   „           -      1.28
      Fifth    „           -      1.44
      Sixth    „           -      1.59
      Seventh  „           -      1.86
      Eighth   „           -      2.14
      Ninth    „           -      2.59
      Highest  „           -      3.51

      General Average             1.731

Same for 10,000 Workers (entries legible in this copy, in the order
printed) : .79, 1.00, 1.24, 1.50, 2.00, 2.22, 2.58, [?].55 ; General
Average 1.8[?].



The tabulation of the data collected for the WAGE CENSUS 
on such forms as that on p. 36, illustrates well some of the 
difficulties involved. The items given on the main part of the 
schedule are of this kind : — 

                        No.     Average Wage.

Spinners, Time :         6          12s.      :   56½ hours.

Such returns are not perfectly definite, for if many are
employed in the same occupation in a mill, it is possible that
they will earn at different rates. Thus this entry of 6 at 12s.
might arise from either 6 men each earning 12s., or 2 at 10s.,
2 at 12s., 2 at 14s. (average 12s.) ; or 4 at 12s., 1 at 15s.,
1 at 11s. ; or 5 at 12s. and 1 at 18s., 12s. being the general
rate, but not the average, in these last two alternatives. Since
the purpose of the wage census was to give a comprehensive
account of wages adapted for use in all investigations, it should
show the numbers in all trades and subdivisions of employment
by age, sex, and district, the average and general rate of pay for
each group, and sufficient details to show the distribution about
the average in each group, for a mere average may conceal
exceptionally high or exceptionally low wages.
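
The four readings of the entry "6 at 12s." may be compared by a short sketch. The helper `general_rates` is an illustrative name, and the general rate is here taken to mean the wage paid to the largest number of workers, which is an assumption about the sense intended ; the 15s. of the third alternative is likewise the reading assumed for a figure that is blurred in this copy.

```python
from collections import Counter
from statistics import mean

# Wages in shillings under each of the four alternatives named in the text.
alternatives = {
    "6 at 12s.": [12] * 6,
    "2 at 10s., 2 at 12s., 2 at 14s.": [10, 10, 12, 12, 14, 14],
    "4 at 12s., 1 at 15s., 1 at 11s.": [12, 12, 12, 12, 15, 11],
    "5 at 12s., 1 at 18s.": [12] * 5 + [18],
}

def general_rates(wages):
    """Wage, or wages, paid to the largest number of workers."""
    counts = Counter(wages)
    most = max(counts.values())
    return sorted(wage for wage, number in counts.items() if number == most)

for label, wages in alternatives.items():
    print(f"{label}: average {mean(wages):.2f}s., most common rate {general_rates(wages)}")
```

In the last two alternatives the most common rate remains 12s. while the average rises above it, which is the concealment the paragraph warns against.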

On inquiry at the Labour Department as to whether the 
original information had been given in a more detailed form than 
the line above, or whether divergencies might be concealed, the 
author learnt that the subdivision of occupations had been carried 
to such an extent, that in practice, where there was any great 
variation in the wages of workers under one heading, that heading
had been split up, so that each group was separately entered,
or that several groups were distinguished under one heading ; and
that when there was reason to believe from the light of other
returns that this had not been done, supplementary inquiries
were made on this point, so that the original data were detailed
enough for any requisite fineness of tabulation.

The problem then was to tabulate the answers from the
various factories in a district, to show clearly and succinctly
the distribution of wages in each subdivision and in the whole.
It can hardly be said with confidence that the method adopted, of
which a specimen is given on p. 94, is entirely satisfactory.

To clear our ideas let us suppose that the details on which 
the line relating to throwsters (time) was based were as 
follows : — 

      3 earning 14/          "average minimum rate."

     14    „    15/
      6    „    15/6

     20    „    16/    }
     10    „    17/6   }
     20    „    18/    }   68 within 10 per cent. of the average
      8    „    18/6   }      for all, which is 17/7.
     10    „    19/    }

     10    „    20/6          }   18 earning 20/11 on the average.
      8    „    [illegible]   }



The process adopted in the tabulation may be supposed to
have been to separate from the whole group of returns a small
group of old men or inferior workers earning far below the
average, and enter them as a distinct minimum group, and to
separate a small group of the most skilled workers and enter
them as a maximum group. This is better than giving simply
the highest and lowest of the individual wages, for either of
these may be due to exceptional circumstances, and may be quite
a long way from that paid to any other person. The exact size
of these extreme groups must be determined from inspection of
the returns themselves. After this has been done, the remaining
wages may not be grouped close together ; in the example taken
they are scattered between 15s. and 19s. To give some clue as
to this distribution the number earning within 10 per cent. of
the average is stated ; this is probably the best way if only one
column can be devoted to it, but 10 per cent. is a wide limit
to adopt. Another method would be to give the limits within
which the wages of the 10 per cent. of the earners above and
10 per cent. below the average were contained : in this case 16s.
and 18s.

If, however, not more than 8 columns are to be devoted to 
each group, the following arrangement would give much more 
definite information, and it could have been made from the data 
in hand, and would be well adapted for all the purposes for 
which it would be required. 

      Number employed              -   -     109
      General average              -   -     17/7
      Average of lowest tenth *    -   -     14/9
      Quartile †                   -   -     16/
      Median †                     -   -     18/
      Quartile †                   -   -     19/
      Average of highest tenth *   -   -     21/2
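
Several lines of this arrangement can be recomputed from the supposed details of the throwsters given above, as in the following sketch. It rests on two assumptions which are not the author's : the 18 best-paid workers are entered together at their stated average of 20/11, since their separate rates are not all legible in this copy, and the quartiles and median are taken as the wages of the workers standing one-quarter, one-half and three-quarters of the way up the list. The function names are illustrative.

```python
from math import ceil

# (number of workers, weekly wage in pence) ; 12 pence to the shilling.
# The last group is entered at its stated average, 20/11 (an assumption).
returns = [
    (3, 14 * 12), (14, 15 * 12), (6, 15 * 12 + 6), (20, 16 * 12),
    (10, 17 * 12 + 6), (20, 18 * 12), (8, 18 * 12 + 6), (10, 19 * 12),
    (18, 20 * 12 + 11),
]

def shillings(pence):
    """Write pence in the s/d notation of the text, to the nearest penny."""
    s, d = divmod(round(pence), 12)
    return f"{s}/{d}" if d else f"{s}/"

def wage_at(fraction):
    """Wage of the worker standing `fraction` of the way up the ordered list."""
    rank = ceil(fraction * sum(n for n, _ in returns))
    passed = 0
    for n, wage in returns:
        passed += n
        if passed >= rank:
            return wage
    return returns[-1][1]

number = sum(n for n, _ in returns)
average = sum(n * wage for n, wage in returns) / number
near = sum(n for n, wage in returns if abs(wage - average) <= 0.10 * average)

print(number)                                               # 109
print(shillings(average))                                   # 17/7
print([shillings(wage_at(q)) for q in (0.25, 0.5, 0.75)])   # ['16/', '18/', '19/']
print(near)                                                 # 68
```

The number employed, the general average, the quartiles and the median agree with the arrangement above, and the 68 agrees with the number stated to be within 10 per cent. of the average ; the averages of the extreme tenths are not attempted, as they depend on the rates that are not legible.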

We are fortunately not dependent solely on the tabulation
as given above, for wages in industries as a whole are also
tabulated on the following plan, which is in a form most useful
for purposes of comparison (p. 96).

The lines giving percentages are most useful. We can at a
glance compare the levels of wages in different industries. Thus
in the cotton manufacture the average wage is 2s. higher than in
the woollen ; and in the cotton there is a large group of highly
skilled workers earning from 30s. to 35s., while in the woollen
nearly half are close to the average, earning between 20s. and
25s. In the jute and linen manufactures the averages are nearly
the same, but in the former a larger proportion are below the
15s. limit. In the silk manufacture there is an aristocracy as in
the cotton, but it is smaller and better paid, for 12 per cent.
earn more than 35s. This table is a masterpiece of concentration
and clearness.

* Vide p. 92. † Vide p. 124.

We will discuss next the tabulation of the figures relating
to CHANGES in RATES of WAGES collected by the Labour
Department. Specimens of the forms by which such information
is obtained were given among those relating to strikes (p. 57).
Referring to them, it will be seen that the facts given are the
occupations and numbers affected, the dates from which the
changes took place, and the wages and hours in a full week
exclusive of overtime (a definition corresponding exactly to that
used for the wage census) before and after the change.



EXTRACT FROM TABLE showing the Changes in Rates of Wages and
Hours of Labour of Ordinary Agricultural Labourers in Various
Districts of the United Kingdom in 1894, so far as reported to the
Board of Trade.*

Columns : County and Union ; Particulars of Changes in Summer
Wages (1894 compared with 1893), Increase and Decrease, per week ;
Particulars of Changes in Winter Wages (1894 compared with 1893),
Increase and Decrease, per week ; No. of Male Agricultural
Labourers, Farm Servants, Shepherds, Horsekeepers, Horsemen,
Carters, in '91.

      County and Union.               No. in '91.

      Lincolnshire:
        Gainsborough                     2,466
        Louth                            3,932
        Spilsby                          3,288
      Norfolk:
        Aylsham                          2,576
        Docking                          2,487
        Flegg, East and West             1,108
        Forehoe                          1,448

Entries in the columns of changes, in the order printed (their
alignment with the unions is lost in this copy) :

      Summer, Increase :   ...
      Summer, Decrease :   1/ (12/ to 11/) ; 6d. (12/6 to 12/) ; 1/ (12/ to 11/)
      Winter, Increase :   1/ (10/ to 11/)
      Winter, Decrease :   1/6 (15/ to 13/6) ; 1/6 (13/6 to 12/) ; 1/6 (13/6 to 12/) ;
                           1/ (11/ to 10/) ; 1/ (11/ to 10/)



* From the second Annual Report on Changes of Wages, pp. 198-9 ; a
little compressed.

EXTRACTS FROM TABLE showing the Changes in Rates of Wages of
Ordinary Agricultural Labourers in Various Districts of the United
Kingdom in the Summer of 1895, so far as reported to the Board
of Trade.*

                                   No. of Male
                                 Labourers, Farm      Particulars of Changes
                                Servants, Shepherds,  in Summer Wages (1895     Weekly Rate of Wages
                                  Horsekeepers,       compared with 1894).          in Summer.
                                   Teamsters,         Decreases in italics.
   County and Union.             Carters, in 1891.    Per Week.                   1894.           1895.
                                                                                  s. d.           s. d.
   Durham:
     Stockton *                        437            Decrease of 6d.             17 6            17 0
     Teesdale (Barnard Castle
       Rural Dist.) *                  669 †          Advance of 6d.              17 6            18 0
   Oxfordshire:
     Headington                      1,118            Decrease of 1s.             12 0            11 0
     Henley (Hambleden Rural
       Dist., Bucks)                 1,587 †          Decrease of 1s.         12 0 to 14 0    11 0 to 13 0
   Norfolk:
     Flegg, East & West              1,108            Decrease of 1s.             11 0            10 0
     Forehoe                         1,448            Decrease of 1s.             11 0            10 0
     Henstead                        1,504            Decrease of 1s.             11 0            10 0
     Mitford and Launditch           3,622            Decrease of 1s.             11 0            10 0
     Smallburgh                      2,264 ‡          Decrease of 1s.             11 0            10 0
     Swaffham                        1,942            Decrease of 1s.             11 0            10 0
     Wayland                         1,535            Decrease of 1s.             11 0            10 0
   Carnarvonshire:
     Carnarvon (Gwyrfai Rural
       Dist.)                        1,124 †          Labourers without food,
                                                        advance of 1s.            19 0            20 0
                                                      Labourers with food,
                                                        advance of 1s.            11 0            12 0

* Agricultural labourers in this district are hired in March and April for a year
certain, and the change noted applies to the whole year, and not to the summer only.

† The number of agricultural labourers, &c., is for the Poor Law Union, but the
change applies to the Rural District only.

‡ This number is partly estimated.



* From the third Annual Report on Changes of Wages, pp. 118, 119, 121
(typography adapted).

The adjoining tables give examples of the way in which the
changes in agricultural wages were tabulated in the Second and
Third Report on Changes in Rates of Wages and Hours of Labour.
In the first table space is wasted by devoting separate columns
to increases and decreases, with the intention of making the table
distinct ; while it is not clear whether "Winter 1894" means the
winter beginning in or that ending in that year.

In the second table, which refers to summer wages only, the
columns are rearranged ; and increases and decreases printed in
the same column, the latter in italics. In the Fifth Report all
the information is printed in a clearer way, thus:

                               Winter Wages.*

                                                      Increase or Decrease per
   District.     Number.        Weekly Rates.              Week in 1897.
                            Jan. '96.    Jan. '97.     Increase.     Decrease.
                             s. d.        s. d.         s. d.
   Tendring       3,113      10  0        11  0          1  0           ...



The tabulation is repeated for the summer.

* From the fifth Annual Report on Changes of Wages, p. 145.

The weakness in these agricultural returns is in the numbers
column. In the returns from other industries the numbers given
are those actually affected, but in this case it is not found
possible to obtain this number correctly, and the number entered
is that found under "agricultural labourers" in the 1891 census,
which includes the various categories as given in the above table.
When a change of wages takes place in a rural district, we may
perhaps assume that it is likely to be general, though if it was
a reduction, it might not be made by the better employers ; and
though the change will not take place in the same week throughout
the district, there is not likely to be much variation in this
respect. The change is generally made at the time that winter
wages give place to summer, or summer to winter ; and a slight
increase or decrease may take place by making the winter
reduction or the summer advance later than usual. On the
whole, little error will be introduced by assuming that the change
stated affects all the adult agricultural labourers in the district,
and it is quite probable that a proportional change will take
place in the wages of horsekeepers, shepherds, and others,
though it may not in the case of boys, or old men who are
earning less than the district rate. The question, "Approximate
number of able-bodied labourers in parish?" is asked on the
inquiry form, but as the answers are not used, it may be
assumed that they are generally not given with sufficient
exactness.

The object of the whole tabulation is to show the change in
the national weekly wages bill, but many details are lacking for
the complete calculation. In the case of agricultural labourers,
we need, in addition to these data, accurate statements of the
change of additional earnings, special payments, and payment
in kind. In all cases we need a more complete account of the
whole wage-bill as well as the change. For agricultural labourers
the material has just been published by the Labour Department ; *
every year it receives returns from most of the 600 unions as to
wages at all seasons, whether there has been a change or not.

The looseness in the returns as to numbers does not prevent
our calculating the change in the county or country rates, for
the numbers in each district affected by the change may be
expected to bear the same proportion to the numbers given in
the census returns, as the number of agricultural labourers of the
same class in the whole county or country does to the census
number.

The calculation for Durham in the above table for the 
changes in summer wages 1894-95 may be performed as 
follows : — 





                  Average before                 Proportional         Amount of change
                     change.        Change.     number affected.       on wage-bill.
                     s. d.                                                s. d.
   Stockton          17  6           -6d.             4                   -2  0
   Teesdale          17  6           +6d.             7                   +3  6

   Total change in county, +1s. 6d.
   Proportional number in county, 73.

   Effect on county average, +1/6 ÷ 73 = +¼d.
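
The Durham calculation may be set out in a few lines, as a sketch only. The function name is illustrative ; the census numbers 437 and 669 are taken to the nearest hundred as in the table, and the proportional number for the county, 73, is taken from the text.

```python
from fractions import Fraction

def county_effect(changes, county_number):
    """Change on the wage-bill and its effect on the county average, in pence.

    changes       -- list of (change in pence, proportional number affected)
    county_number -- proportional number of labourers in the whole county
    """
    change_on_wage_bill = sum(change * number for change, number in changes)
    return change_on_wage_bill, Fraction(change_on_wage_bill, county_number)

# Stockton: decrease of 6d. on 437 labourers ; Teesdale: advance of 6d. on 669.
durham = [(-6, round(437 / 100)), (+6, round(669 / 100))]
total, per_head = county_effect(durham, county_number=73)

print(total)            # 18, i.e. +1s. 6d.
print(float(per_head))  # roughly a quarter of a penny
```

The quotient 18/73 of a penny is the ¼d. of the table, to the nearest farthing.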

* On these points see Mr Wilson Fox's Report on Wages and Earnings
of Agricultural Labourers, 1900, p. 50, and pp. 111-157.

Here, for simplicity of calculation, the numbers affected are
taken to the nearest 100, a process which is not likely to affect
the average perceptibly.* This rough method is likely to give
the result as accurately as the original data make possible. A
similar process with suitable modifications can be applied to the
changes tabulated for other industries. The summary of such
returns for agriculture for all counties is as follows:

Comparison of the Net Effect of the Changes of Cash Wages
per Week paid in the Years 1896 and 1895 in certain Districts
in England and Wales.†

                              WAGES IN 1896 AS COMPARED         WAGES IN 1895 AS COMPARED
                                     WITH 1895.                        WITH 1894.

                                         Net Effect of Changes             Net Effect of Changes
                                           on Weekly Wages.                  on Weekly Wages.
                              Total        Increase (+) and     Total        Increase (+) and
   District.                 Number.**      Decrease (-).      Number.**      Decrease (-).
                                          Total.   Per Head.                Total.   Per Head.
                                            £        s. d.                    £         d.
   England:
    Northern Counties          5,662       -43      -0 1¾        3,766       +44       +2¾
    Yorkshire, Lancashire,
      and Cheshire             2,897      +100      +0 8¼        3,942      -126       -7¾
    Eastern and Midland
      Counties                69,869      +666      +0 2¼       89,576    -2,045       -5½
    Southern and Western
      Counties                20,901      -340      -0 4        20,441      -575       -6¾
   Wales                         ...       ...       ...         2,165       +73       +8¼

      Total                   99,329      +383      +0 1       119,890    -2,629       -5¼

** The number given is the total of male agricultural labourers, farm servants, shepherds,
horsekeepers, in 1891, in the Poor Law Unions in which the changes took place.



* The corresponding calculations for Oxfordshire are:

         12/       -1/        11        -11/
         13/       -1/        16        -16/
                                        -27/

      Effect on county average, -27/ ÷ 101 = -3d.

   For Norfolk:

         12/       -1/       134       -134/

      Effect on county average, -134/ ÷ 425 = -4d.

† From the fourth Annual Report on Changes of Wages, p. xliv.

The value of this table is not obvious. It seems of little
importance to know how many persons were affected altogether ;
though it is of some value to learn from a previous table that
58,578 persons received increases, and 40,751 decreases in 1896.
This total of persons affected is constantly given in these tables ;
if a person receives an increase of 1s. one month, and loses it the
next, he is counted as 2, and his contribution to the next column
(net effect of change) is zero. This -£43 may mean that 2,000
persons received a decrease of 1s. each, and the remaining 3,662
(same or different persons) an increase of 3¾d. each, or any other
figures which would give the same total. The change per head
in the next column is unimportant ; it only shows an arithmetical
quotient with no concrete meaning that can be expressed in words.
If it was replaced by another quotient, viz., -£43 ÷ n, where n is
the number of agricultural labourers in the Northern Counties,
we should know the effect on average wages. In fact, the table
would be more useful thus:



Approximate Effect of Changes on National Weekly
Wage Bill.

                     INCREASES.             DECREASES.
                                                                 Net       Total No.    Average
   District.      No.                    No.                   Change.     Employed.    Change.
                affected.    Total.    affected.    Total.
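
That a net figure such as -£43 admits of many readings may be verified by a short sketch. The first set of changes is the one suggested in the text for the Northern Counties ; the second is purely hypothetical, and is chosen only to give the same net total. The function name is illustrative.

```python
# 12 pence to the shilling and 20 shillings to the pound.
PENCE_PER_POUND = 240

def net_effect_pounds(changes):
    """Net effect on the weekly wage-bill, in pounds, of (persons, pence each) changes."""
    return sum(persons * pence for persons, pence in changes) / PENCE_PER_POUND

# Reading given in the text: 2,000 lose 1s. each, 3,662 gain 3¾d. each.
reading = [(2000, -12), (3662, 3.75)]
# Hypothetical reading: 860 lose 1s. each and the other 4,802 are unchanged.
other = [(860, -12), (4802, 0)]

print(round(net_effect_pounds(reading)))  # -43
print(round(net_effect_pounds(other)))    # -43
```

Both readings count 5,662 persons and yield the same net effect, so that neither the total number nor the quotient per head distinguishes between them.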



















The figures given supply an example of the common practice
of carrying out into detail a calculation which depends originally
on incorrect numbers, in this case the number employed, and is
therefore misleading throughout. Till the average (useless here
in any case) is taken, the error in this quantity has no injurious
effect. As shown above, the average here given could be replaced
by another which would be of use, and which would be correct
within limits that could be defined, and would be narrow enough
for most purposes.

Further, since the column of numbers affected is admittedly
wrong, the figures should be given to the nearest 1,000 rather
than to units, even if no attempt was made to estimate the new
figure ; "between 5,000 and 6,000 are affected" is a more useful
and correct statement than "5,662 persons belonged in 1891 to
a class in some undefined way connected with that in question
in 1896."

The discussion of Group C, the tabulation of non-numerical 
answers, must be postponed till we have analysed the nature and 
use of averages.

CHAPTER V.

AVERAGES. 

It is natural, in a book with the present title, to allot a
considerable space to averages. By the use of averages complex
groups and large numbers are presented in a few significant
words or figures ; and thus the two definitions of statistics,
the Science of Averages and the Science of Large Numbers, are
reconciled.

Some writers have attempted to draw a distinction between
averages and means, but no general agreement has been reached
as to the exact senses in which the words are to be separately
applied.* The best distinction may be made by deciding that an
average is a purely arithmetical conception, such as the average
length of life in a varied population, which does not correspond
to any particular group, but is only a short way of expressing an
arithmetical result ; while the word "mean" is to be applied to
some objective quantity, such as the mean height of Englishmen,
about which all height-measurements are grouped according to
a definite law, which will be discussed in the sequel.†

A. Arithmetic Averages. — We may rapidly pass by 
some of the common uses of the word "average," and pick - 
out those which will prove of use in statistics. An average is 
sometimes used merely to save big figures. The average weight 
of the University crew is given, only because it is more usual ' 
to speak of a man's weight being \2\ stone than of eight men's 
weight being 12 J cwt, and it is easier to connect the former 
with men's weight in general. Similarly, if we are comparing / 
the value of the exportations of some commodity in two periods C 

* Compare the article "Moyenne," by Dr Bertillon, in Dictionnaire
encyclopédique des Sciences Médicales, with this chapter. See also the
paper by Dr Venn in the Statistical Journal, 1891, and chap. xviii. in his
Logic of Chance.

† See Part II, Section i, infra.
of ten years each, we should say that the yearly average in the
period 1870-79 was £10,000,000, and in 1880-89 was £11,000,000,
rather than that the totals were £100,000,000 and £110,000,000.
This leads to the second ordinary use of the word. If we
were comparing the ten years 1870-79 with the eleven years
1880-90, and the totals in the periods were £100,000,000 and
£132,000,000 respectively, we should obtain no grasp of the
difference till we had reduced them to a common denominator by
dividing by the number of years, and found that the averages in
the two periods were £10,000,000 and £12,000,000. This class of
averages is well known in cricket; sometimes the total number
of runs made or wickets taken by each cricketer are stated also,
but these are rather as so-called statistical curiosities than
as having much bearing on the skill or luck of the players. The
numbers by which the seasons' performances are judged are the
quotients of the number of runs by the number of innings, of the
number of wickets by the number of runs, and so on, all
quantities being reduced to a common denominator. A
consideration of the best methods of comparing cricketers or
counties, and an exposure of the fallacies inherent in the
present system, would afford a useful exercise in the use of
averages and the choice of the most appropriate kind. The
average in this sense is very common in mechanics. The average
pressure per square inch, the average work done by an engine per
minute, the average speed of a train, are quantities which it
is frequently necessary to use. Such an expression as the
average rate of interest is precisely similar.
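The reduction to a common denominator can be checked with a few lines of Python. This is only an illustrative sketch: the totals and period lengths are those quoted above, while the names `periods` and `yearly_average` are introduced here for the example.

```python
# Totals (in pounds) and lengths of the two periods quoted in the text.
periods = {
    "1870-79": {"total": 100_000_000, "years": 10},
    "1880-90": {"total": 132_000_000, "years": 11},
}


def yearly_average(total, years):
    """Reduce a period total to a per-year figure (the common denominator)."""
    return total / years


averages = {
    name: yearly_average(p["total"], p["years"]) for name, p in periods.items()
}

for name, value in averages.items():
    print(f"{name}: {value:,.0f} per year")
```

The raw totals differ by 32 per cent, but the yearly averages differ by only 20 per cent, because one period is a year longer than the other.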

It will be clear that percentage is a special case of this
use of average. It is useless when comparing the growths
of population or of trade to give only the whole numbers. An
increase of 50,000 in the population of London is not so
significant as one of 10,000 in that of Harrow; they must be
expressed as increases of 1 per cent. and 150 per cent., say,
before their meaning can be appreciated, and this is the same
thing as giving the average increase to 100 inhabitants. For
this reason the records of births, deaths, and marriages are
always given as rates, so many per 1,000 inhabitants; and in
these cases a double average is given, for the rates signify so
many per 1,000 inhabitants per annum.
Another extension of the same use is found when quantities
are reduced to rates "per head" of the population. This
use is solely for comparison, and the principle employed is
that of the common denominator. It would be futile to state
that the amount spent on drink was, say, £100,000,000 in 1860
and £110,000,000 in 1890; but the corresponding statements
that the amounts were £3 10s. per head in 1860 and £2 18s.
per head in 1890 would make a comparison possible. Or, to
take a better instance: in studying the increments in the values
of England's foreign trade, an entirely wrong view is obtained,
unless we calculate for each year the value per head of the
population, instead of looking only at the totals. A neglect
of this division would make municipal expenditure appear to
be growing much faster than it really is; and in preparing
any comparative summary of figures, it is always necessary
to consider whether such an average should be taken.
So far, the averages considered are simply arithmetical, and
satisfy the following definition:

Average × number to which it applies = total quantity dealt with.

e.g., Average weight × number of crew = total weight of crew.
Average value of imports per head of population × number of
population = total value of imports, and so on.

The following question, however, will lead us further. The
average weekly agricultural wages in 1892 in Wilts, Dorset,
Devon, Cornwall, and Somerset were 10s., 10s., 13s. 6d., 14s.,
11s. respectively. What was the average in the south-west of
England?

The simplest method is to say, the average was

$$\frac{10\text{s.} + 10\text{s.} + 13\text{s. }6\text{d.} + 14\text{s.} + 11\text{s.}}{5} = \frac{58\text{s. }6\text{d.}}{5} = 11\text{s. }8.4\text{d.},$$

and for many purposes this would be sufficient; but it does
not satisfy the above definition. For when we ask the double
question "11s. 8.4d. multiplied by what number equals what
total?", we can only answer that 11s. 8.4d. multiplied by the
number of items equals the sum of items.
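The arithmetic of this simple average is easily verified by working in pence. The sketch below uses only the five county figures quoted above; the helper names `to_pence` and `format_sd` are assumptions made for the illustration.

```python
def to_pence(shillings, pence=0):
    """Convert shillings and pence to pence (12 pence to the shilling)."""
    return 12 * shillings + pence


def format_sd(pence_total):
    """Render a number of pence as shillings and pence."""
    shillings, pence = divmod(pence_total, 12)
    return f"{int(shillings)}s. {pence:g}d."


# Wilts, Dorset, Devon, Cornwall, Somerset (weekly wages, 1892).
county_wages = [
    to_pence(10),
    to_pence(10),
    to_pence(13, 6),
    to_pence(14),
    to_pence(11),
]

total = sum(county_wages)
simple_average = total / len(county_wages)

print(format_sd(total))           # sum of the five items
print(format_sd(simple_average))  # simple average of the five items
```

Here the average multiplied by 5 gives back only the sum of five county figures, not the total wages paid to any body of labourers, which is the point made in the text.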

We must consider further what we understand by the
expressions "average wage in each county," and "average wage
in the group of five counties."




It may be supposed that the average wage in Wilts, for
instance, was compiled by getting returns from different villages,
say 12s., 11s., 9s., 9s. 6d., 10s. 6d., 9s., 9s., adding them and
dividing by the number of villages. This of course satisfies
our definition no better than the former. What is to be
understood by the average in each village? If our present
definition is to be satisfied, it should be the total of the wages
paid in the village divided by the number of workers. It is
hardly necessary to say that this total is never found in such
an investigation, and the average is given from observation or
by guess-work, not by calculation.

If, however, the village average was correct, and we had 
returns from all the villages in the county, we should find 
the county average as follows : — 

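$$\frac{12\text{s.} \times 200 + 11\text{s.} \times 150 + 9\text{s.} \times 300 + 9\text{s. }6\text{d.} \times 150 + 10\text{s. }6\text{d.} \times 400 + 9\text{s.} \times 200 + 9\text{s.} \times 200}{200 + 150 + 300 + 150 + 400 + 200 + 200} = \frac{15{,}975\text{s.}}{1{,}600} \approx 9\text{s. }11.8\text{d.},$$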

where the numbers in the denominator are the numbers of 
labourers in the respective villages. We should then have the 
same result as if we had had the wages of all the labourers 
in the county put down on a sheet, added up, and divided by 
their number, and the average would satisfy the definition. 
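The county average just described, and the fact that only the proportions of the weights matter, can be sketched as follows. The village wages and numbers of labourers are those of the example above; the function name `weighted_mean` and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction

# Village average wages in pence: 12s., 11s., 9s., 9s. 6d., 10s. 6d., 9s., 9s.
wages = [144, 132, 108, 114, 126, 108, 108]
# Numbers of labourers in the respective villages.
labourers = [200, 150, 300, 150, 400, 200, 200]


def weighted_mean(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return Fraction(sum(v * w for v, w in zip(values, weights)), sum(weights))


county_average = weighted_mean(wages, labourers)

# Dividing every weight by 50 leaves the result unchanged.
proportional = [n // 50 for n in labourers]  # 4, 3, 6, 3, 8, 4, 4
same_average = weighted_mean(wages, proportional)

# The simple average of the seven village figures, for comparison.
simple_average = Fraction(sum(wages), len(wages))

print(float(county_average), float(same_average), float(simple_average))
```

The weighted figure (about 9s. 11.8d.) multiplied by the 1,600 labourers gives the total wages paid, so it satisfies the definition; the simple average of the seven villages (10s.) does not.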

It is clear that we can simplify this arithmetical work,
for if we divide throughout by 50 we get the same result;
this is as if we said there were 4, 3, 6 . . . labourers in the
villages instead of 200, 150, . . . Thus we get the same
result if we take numbers proportional to the total numbers
of the labourers instead of the actual numbers. This plan
has two advantages: first, that though we do not know the
numbers of labourers, we know numbers nearly proportional
to them, viz., those included in the census returns under the
general headings relating to agriculture; and secondly, we need
not choose our numbers with absolute exactness; thus the
numbers of labourers above given may be supposed to be round
numbers substituted for 213, 145, 320 . . . ; and it will presently
be seen that such differences hardly affect the average. We
idealize the village, and suppose it to contain round numbers;
and then for the numerical work take simple numbers
proportional to these. This is important as simplifying
numerical work.

Averages obtained for the county in this way do not
absolutely satisfy our definition, but are very nearly equal to
those that do. We can then proceed to take the average for
the south-west of England on the same principles.

B. Weighted Averages. — This discussion introduces and 
gives an example of the very important statistical method 
known as " weighting the average." We may illustrate it 
further from the same figures by considering what weights to 
apply to get this average for South-West England. We may find
the number of agricultural labourers in the counties and work
out the average thus:

$$\frac{10\text{s.} \times 20{,}000 + 10\text{s.} \times 30{,}000 + \dots}{20{,}000 + 30{,}000 + \dots};$$

or we may argue that since we have no means of knowing the
exact numbers of labourers we may as well arrange the
weights, according to the importance of the counties, say
20,000, 30,000, &c., from some other point of view, and
take numbers representing such quantities as the amounts
of wheat produced, the area, or the rate of increase of
population. In this particular case these methods would be 
absurd, but in other problems the weights are not so obvious. 
Suppose, for example, that we are considering the attraction 
of London on the inhabitants of various counties ; that we are 
told that so many immigrants arrive from Essex, Norfolk, and 
Suffolk, and so many from Stafford and Worcester, and we are 
asked to compare the attractive power on the agricultural and 
manufacturing counties. Should we weight the numbers given 
by the total numbers of inhabitants of the contributing counties, 
or by their distance from London, or by some quantity derived 
from these? 

A more practical problem, the classical and most useful
application of weights, is the formation of an index number
for the change of prices by fitting suitable weights
to the changes measured in the prices of various
commodities. This will be considered separately,* but it is best
to deal with the first principles here. It is required to find the
change in the value of gold when measured by the prices of other
commodities. Suppose that we are given that the prices of
certain commodities between two years were in the following
ratios:

* See infra, Chap. IX.
                Wheat.   Silver.   Meat.   Sugar.   Cotton.

First Year       100      100      100     100      100
Second Year       77       60       90      40       85



The simplest way to estimate for the general fall in price is
to take the simple average of the numbers in the second year,
viz., 70.4; and say that general prices in the second year were
70.4 per cent. of those in the first, and the value of gold had
increased in the ratio 100:70.4 when expressed in commodities.
But it is at once clear that we cannot allow the commodities
given to have equal influences on the result; wheat is of greater
importance than sugar and meat than silver; and again we
have taken arbitrarily three items to represent food and one for
clothing; we need some means of deciding relative importance.
Suppose we decide that wheat, cotton, meat, and sugar are
respectively 7, 4, 3 times and twice as important as silver, we
should get the following table:



Commodity.    Relative Price in    Weight       Product.
              Second Year.         Assigned.

Wheat                77                7           539
Silver               60                1            60
Meat                 90                3           270
Sugar                40                2            80
Cotton               85                4           340
                    ---               --          ----
                    352               17          1289

Weighted average is 1289/17 = 75.8.

Unweighted average is 352/5 = 70.4.

This process is equivalent to writing down the price of wheat
seven times, silver once, meat thrice, &c., and then taking the
simple average of these numbers.
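The two averages, and the equivalence with writing each price down as many times as its weight, may be checked with the short sketch below. The prices and weights are those of the table; the variable names are introduced only for the illustration.

```python
# Relative prices in the second year (first year = 100) and assigned weights.
prices = {"Wheat": 77, "Silver": 60, "Meat": 90, "Sugar": 40, "Cotton": 85}
weights = {"Wheat": 7, "Silver": 1, "Meat": 3, "Sugar": 2, "Cotton": 4}

# Unweighted average: every commodity counts once.
unweighted = sum(prices.values()) / len(prices)

# Weighted average: sum of price x weight, divided by the sum of the weights.
product_total = sum(prices[c] * weights[c] for c in prices)
weight_total = sum(weights.values())
weighted = product_total / weight_total

# Equivalent procedure: write each price down as many times as its weight
# and take the simple average of the resulting list.
repeated = [prices[c] for c in prices for _ in range(weights[c])]
repeated_mean = sum(repeated) / len(repeated)

print(unweighted, round(weighted, 1), round(repeated_mean, 1))
```

The list `repeated` has 17 entries, one for each unit of weight, which is why the weighted average has 17 and not 5 as its divisor.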

The idea is made clearer by the mechanical analogy in which
the word weight originated. Suppose a uniform weightless rigid
rod graduated in 100 equal divisions, and equal
weights hung at the 77th, 60th, 90th, 40th, and
85th divisions from one end; the rod will then balance at a
point corresponding to the unweighted average, 70.4 intervals
from the same end. Now, suppose the equal weights replaced
by weights of 7, 1, 3, 2, 4 lbs. respectively, and the rod will
balance at a point corresponding to the weighted average,
75.8 intervals from the same end. The further any particular
mass is moved, or the heavier it is, the more the centre of
gravity will be shifted; and this clearly corresponds to the
influence we should wish the various prices to have in the
statistical problem. The formula in use in Statics,
$\bar{x} = \frac{\Sigma\, mx}{\Sigma\, m}$,
which corresponds to the arithmetic on the previous page, can
also be used in Statistics.

The discussion of the proper weights to be used in this and
other averages has occupied a space in statistical literature out
of all proportion to its significance, for it may be said at once
that no great importance need be attached to the special
choice of weights; one of the most convenient facts of
statistical theory is that, given certain conditions, the same
result is obtained whatever logical system of weights is applied.
We must postpone the mathematical analysis of this proposition,
but may offer immediately some arithmetical illustrations.

The following table affords an example of this principle,* and
is worth careful study. At the commencement of the Wage Census,
circulars were sent to all the principal firms in all
well-located trades, asking for details as to wages. Of these
some were not returned, and the numbers allotted in the Final
Report to each trade are not the numbers which actually belong
to the trade in the whole country, but the numbers of those in
the firms which made returns. The average wage given is not
therefore the arithmetic average for these trades for the whole
country corresponding to the definition given above for average,
but the average of the average wages as returned in each trade
weighted by the numbers for whom returns were made; so that the
average wage given for the whole group of trades might have
proved to be different, if with the same average in each trade
the returns had been complete. It is very unlikely, however,
that there would have been any great difference. In the table
several systems of weighting are used; the first are the numbers
in these returns, giving an average, 24s. 7d.; the second are
the numbers belonging to each trade according to the census when
they are above a certain minimum, giving an average 25s. 3d.;
the third is a purely arbitrary list of figures taken from a
source which has no connection with wages, and the average is
24s. 5½d.; the last is the unweighted average, that is, all the
weights are equal, and the average is now 24s. 2d. These
averages are close together, while the original items vary from
16s. 6d. to 30s. 5d. It is to be noticed that the true weights
are not known in this case, but that owing to this principle we
are able to dispense with them entirely.

* From the Statistical Journal, December 1897, with corrections.

Examples of the Smallness of the Change Introduced by
Difference in Systems of Weighting.

From the Wage Census.

                                      Average   Number      Numbers      Arbitrary
                                      Wage      Included    when known.  System of
Trade.                                (Men).    in Returns. Unit 1,000.  Weights.
                                      s.  d.

Cotton Manufacture                    25  3      32,189       142          144
Woollen Manufacture                   23  2      12,248        54          172
Worsted and Stuff Manufacture         23  4       7,005        38          219
Linen Manufacture                     19  9       6,807        22           96
Jute Manufacture                      19  4       2,799         9      [illegible]
Hemp, &c., Manufacture                23  6       1,232         3      [illegible]
Silk Manufacture                      22  3       2,248        10          189
Carpet Manufacture                    26  7       1,292       ...          213
Hosiery Manufacture                   24  5       1,070         8          287
Lace Manufacture                      27  3         593         8           51
Small wares Manufacture               20  2       2,734       ...          225
Flock and Shoddy Manufacture          21  2         330         2          200
Coal, Iron Ore, and Ironstone Mines   22 11      67,429        57          142
Metalliferous Mines                   16  6       5,046       ...          190
Shale Mines and Paraffin Oil Works    25  0       3,021       ...          207
Slate Mines and Quarries              22  1       6,933   [illegible]      232
Granite Quarries and Works            21 11       2,315        12          206
Stone Quarries                        23 10       3,956       ...           34
China, Clay, &c., Works               18  8         499       ...           39
Police                                27  7      52,682        58          224
Roads, Pavements, and Sewers          20  9      24,276       ...           29
Gasworks                              27  2      27,965       ...           40
Waterworks                            24  9       5,187       ...          151
Pig Iron (Blast Furnaces)             24  6       6,234       ...      [illegible]
General Engineering, Iron and Brass
  Foundries and Machinery Trades      25  9      41,658       200          173
Shipbuilding, Iron and Steel          29  3      10,661        80          228
Tinplate Works                        33  5      11,514       ...          178
Saw Mills                             24  3       2,088       ...          174
Brass Works and Metal Wares        [illegible]    1,838       ...          222
Shipbuilding, Wood                    28  4         454       ...           79
Cooperage Works                       30  5         327       ...          165
Coach and Carriage Building           26  6       1,664       ...           28
Boot and Shoe Making                  24  3       2,902       ...          142
Breweries                             24  3       8,366       ...           46
Distilleries                          20  4       1,795       ...          129
Brick and Tile, &c., Making           22 10       3,188       ...           55
Chemical Manure Works                 23  0       1,054       ...          210
Railway Carriage and Wagon Building   25  2       2,239       ...          233

                                                  s.  d.      s.  d.       s.  d.
Averages                               ...        24  7       25  3        24  5½

Average with equal weights: 24s. 2d.

The problem dealt with in the next table is to find the average
weekly agricultural wage in England and Wales from the
returns for Michaelmas 1869 and Lady Day 1870,
given in columns 1 and 2. There are very many
different ways of taking this average, some of
which are as follows: Take the average of summer
and autumn for each county, as in column 3, and then the
unweighted average of these 45 numbers; this is 12s. 7d. Suppose
the summer wage to be paid twice as long as the autumn wage,
as in column 4, and proceed as before; the average is 12s. 5½d.,
the slight difference being due to the inclusion of harvest
payments in the Michaelmas wage, which makes them higher on the
whole than the summer wages. Again, divide the counties into
geographical groups, take the simple average for each group
(the figures marked a in column 3 and b in column 4) and
weight these by the figures marked c in column 5, the numbers
of agricultural labourers in each group; the average of the a
figures with the c weights is 12s. 5d., of the b figures with the c
weights is 12s. 4d. Again, weight the figures for each county
in column 4 with the numbers in column 5, the most obvious
method of all; the average is then 12s. 4d. Again, take the
simple average of the district averages a and b, that is, give each
of the eight districts equal weights; the averages are 12s. 4½d.
and 12s. 3¼d. Or take the simple average of column 3, counting
Yorkshire and Wales each as one county; it is 12s. 8d.
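Several of these results can be recomputed from the eight district rows of the table of agricultural wages. The sketch below is illustrative only: the district averages a and b and the labourer weights c are transcribed from that table (in pence, and in thousands), and the weight 35 for the Monmouth and Wales group is an assumption obtained by adding the county figures, since the printed total is not legible in the scan.

```python
# District averages from column 3 (a) and column 4 (b), in pence, and the
# numbers of agricultural labourers (c, unit 1,000), for the eight groups.
a = [148.5, 150, 136, 126, 141, 159.5, 189, 139]
b = [147, 147, 135, 126, 139, 159, 186, 139]
c = [148, 93, 130, 125, 109, 91, 127, 35]  # last entry assumed (sum of counties)
equal = [1] * 8


def weighted_mean(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def shillings_pence(pence_total):
    """Express pence as (shillings, pence to one decimal place)."""
    shillings, pence = divmod(pence_total, 12)
    return int(shillings), round(pence, 1)


results = {
    "a with c weights": weighted_mean(a, c),
    "b with c weights": weighted_mean(b, c),
    "a with equal weights": weighted_mean(a, equal),
    "b with equal weights": weighted_mean(b, equal),
}

for label, value in results.items():
    print(label, shillings_pence(value))

spread = max(results.values()) - min(results.values())
print("spread in pence:", round(spread, 2))
```

Under these figures the four averages all fall between about 12s. 3d. and 12s. 5½d., although the district averages themselves range from 10s. 6d. to 15s. 9d.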

To obtain new groups, take as weights not the number of
agricultural labourers, but the total population of the districts,
the numbers marked d. Exclude the population of London as
exerting a preponderating influence unconnected with agriculture.
A new factor is now introduced, for population is greatest in the
manufacturing districts, where agricultural labour is of
comparatively little importance, but receives high wages; these
high wages have undue weight, and the average of the figures b
with weights d is brought up to 13s. 1¾d. The "median" of the
county averages in column 4 is 12s. 1d. If column 4 is rewritten
correct only to the nearest 1s., and column 5 to the nearest
10,000, the weighted average is 12s. 5d. If column 3 is
weighted with random numbers quite unconnected with the
problem, viz., the successive digits in the third decimal places
of the logarithms of the numbers 2 to 46, the average is 12s.
10½d. The reader may try any other system of logical or
absurd weights, and he will find that, unless there is some bias
in the selection of weights, or great preponderance is given to a
few counties, the average will be little affected.

Agricultural Wages in 1870.
Illustrations of Various Methods of Weighting, and their Results.

Columns: 1. Michaelmas 1869. 2. Lady Day 1870. 3. Average of
Cols. 1 and 2. 4. Average of Col. 2 × 2 and Col. 1. 5. No. of
Agricultural Labourers in Groups (unit 1,000). 6. Whole
Population in Groups (unit 100,000).

                      1.       2.       3.        4.       5.     6.
                    s.  d.   s.  d.   s.  d.    s.  d.

Sussex              12  3    12  0    12  1½    12  1      34    ...
Surrey              14  0    13  6    13  9     13  8      16    ...
Kent                14  6    14  0    14  3     14  2      44    ...
Hants               11  0    10  6    10  9     10  8      32    ...
Berks               12  0    10  0    11  0     10  8      22    ...
  Average            ...      ...   a 12  4½  b 12  3   c 148   d 22

Herts               14  7    11 10    13  2½    12  9      20    ...
Northants           12  6    11  6    12  0     11 10      23    ...
Hunts               16  0    11  0    13  6     12  8       9    ...
Bedford             13  0    12  0    12  6     12  4      17    ...
Camb.               11  0    12  0    11  6     11  8      24    ...
  Average            ...      ...   a 12  6   b 12  3   c  93   d 14

Essex               12  6    11  0    11  9     11  6      45    ...
Suffolk             10  6    11  0    10  9     10 10      41    ...
Norfolk             11  6    11  6    11  6     11  6      44    ...
  Average            ...      ...   a 11  4   b 11  3   c 130   d 12

Wilts               11  0    10  3    10  7½    10  6      26    ...
Dorset               9  6    10  3     9 10½    10  0      17    ...
Devon               10  0    10  3    10  1½    10  2      34    ...
Cornwall            11  0    11  0    11  0     11  0      17    ...
Somerset            11  0    10  6    10  9     10  8      31    ...
  Average            ...      ...   a 10  6   b 10  6   c 125   d 19

Stafford            13  0    13  0    13  0     13  0      19    ...
Gloucester          11  9    10  9    11  3     11  1      22    ...
Hereford            10  3    10  0    10  1½    10  1      12    ...
Salop               11  0    11  6    11  3     11  4      21    ...
Worcester           13  6    11  0    12  3     11  6      15    ...
Warwick             13  6    12  0    12  9     12  6      20    ...
  Average            ...      ...   a 11  9   b 11  7   c 109   d 27

Leicester           14  0    13  0    13  6     13  4      15    ...
Rutland             12  6    12  0    12  3     12  2       3    ...
Lincoln             14  0    13  6    13  9     13  8      49    ...
Notts               13  6    13  0    13  3     13  2      16    ...
Derby               13  6    14  0    13  9     13 10       8    ...
  Average            ...      ...   a 13  3½  b 13  3   c  91   d 14

Cheshire            13  6    13  6    13  6     13  6      18    ...
Lancs.              15  0    15  0    15  0     15  0      30    ...
Yorks, W.           19  0    15  3    17  1½    16  6      30    ...
Yorks, N.           17  4    13  6    15  5     14  9½     16    ...
Durham              16  6    16  0    16  3     16  2       8    ...
Northumberland      19  6    16  6    18  0     17  6      12    ...
Cumberland          15  0    15  0    15  0     15  0      10    ...
Westmoreland        16  3    15  6    15 10½    15  9       3    ...
  Average            ...      ...   a 15  9   b 15  6   c 127   d [illegible]

Monmouth            12  6    13  9    13  1½    13  4       6    ...
Wales:
  Glamorgan         14  6    14  6    14  6     14  6       5    ...
  Caermarthen       12  4    11  6    11 11     11  9½      4    ...
  Pembroke          11  0    10  0    10  6     10  4       4    ...
  Cardigan           9  0     8  6     8  9      8  8       5    ...
  Brecknock         12  0    12  0    12  0     12  0       4    ...
  Radnor            10  0    10  0    10  0     10  0       2    ...
  Carnarvon         12  0    12  0    12  0     12  0       5    ...
  Average            ...      ...   a 11  7   b 11  7   c  35   d 14

Since the true system of weights which would reduce the
general average to our definition must be allied to some of those
here adopted, and can hardly show greater divergence from
12s. 4d. than these do, we may feel confident that the true
average is within, say, 3d. of this figure. The original items
varied from 8s. 6d. to 19s.; the averages, even those based
on the most extravagant methods, are contained by the limits
12s. and 13s. 1¾d. Without some such argument as this we
should have no clue to the magnitude of the error introduced by
erroneous weights. This is of the greater importance, because
in many statistical questions the true weights are undefinable or
incalculable; now it is seen that, given certain conditions, there
is no need to calculate or define the weights. Notice, however,
that no system of weights can remove an original bias common
to all the figures; if, for example, winter wages throughout
were 1s. less than here reckoned, the corresponding deficit would
appear unchanged in all the averages found. So we arrive at a
very important precept: in calculating averages give all your
care to making the items free from bias, and leave the weights to
take care of themselves.
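The remark that no system of weights can remove a bias common to all the items may be illustrated numerically. In the sketch below the items are the eight district averages marked b in the table of agricultural wages (in pence); the weight systems, other than the labourer numbers c, are arbitrary lists invented for the example, as is the figure 35 for the last labourer weight, which is taken as the sum of the county entries.

```python
# District averages (column 4, figures marked b), in pence.
items = [147, 147, 135, 126, 139, 159, 186, 139]

# Three systems of weights: labourers (c), equal, and an arbitrary list.
weight_systems = {
    "labourers": [148, 93, 130, 125, 109, 91, 127, 35],
    "equal": [1] * 8,
    "arbitrary": [3, 1, 4, 1, 5, 9, 2, 6],  # invented for illustration
}

BIAS = 12  # suppose every item is overstated by 1s. (12 pence)


def weighted_mean(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


corrected = [v - BIAS for v in items]

# Difference between the average of the biased items and of the corrected ones.
shifts = {
    name: weighted_mean(items, w) - weighted_mean(corrected, w)
    for name, w in weight_systems.items()
}

for name, shift in shifts.items():
    print(f"{name}: average moves by {shift:.2f} pence")
```

Whatever the weights, the average moves by the full 12 pence, so a change of weights cannot conceal or cure an error shared by every item.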

C. The Mode. — Passing now from the discussion of the
arithmetic average and its development the weighted average,
let us consider two other means in common use among
statisticians but unfortunately not yet consciously introduced
into common parlance. There are, however, some popular phrases
which, if they have any definite meaning, very nearly resemble
the averages in question. When we hear of the average clerk,
the average undergraduate, the average working-man, the phrases
admit many interpretations. In some way these persons are
supposed to be types of their kind. The average clerk may be
supposed to mean the one who receives the average income of all
clerks, whose expenditure on necessaries and on luxuries is the
average of all of his class, who takes the average amount of
interest in his work, is of average ability and average age,
perhaps also of average height and weight. It will be seen that
this clerk is ideal, and not to be found in any random assembly
of half-a-dozen; for each of these will have some peculiarity,
some quality in which he differs from the average; the average
man of the newspapers does not exist in the flesh, but is an
imaginary person to whom certain attributes are attached.

Quetelet's average man is familiar;* he is of average height,
weight, strength, girth and lung capacity, with eyes of normal
range and medium tint; but he is a more satisfactory model than
the newspapers' average, for in regarding him we see the type
from which all other men may be supposed to have deviated; the
creature that would have been produced if all disturbing causes
were removed. That any actual person should answer exactly to
all these standards is of course in the highest degree
improbable.

* See Quetelet's Physique Sociale; and Edgeworth in Statistical
Journal, December 1893.

Quetelet refers neither to the arithmetic average, the median,
nor mode, but to a mean about which all the similar
measurements are grouped in accordance with a definite law, the
obedience of anthropometrical measurements to which was his
chief theme.

The newspaper average, on the other hand, seems to be the
mode, the position of the greatest density, which may be
explained as follows. Referring back to the table of
American wages, p. 91, or the table on next page,
it will be noticed that in looking down column 2 we find the
numbers increase till we come to 685 (between $1.15 and $1.24),
and then after fluctuations diminish. This number, 685, is the
greatest which occurs in any 10-cent group; and its position is
called the mode, or the position of greatest density, or the
position of the maximum ordinate, or the rate is spoken of as
predominant.

In this column 2 we have, however, 14 maxima in the
correct sense of the word; the numbers rise and fall with little
regularity, and there are 14 modes, of which that at
$1.15-$1.24 is the most pronounced. But if the
groups are made wider, and the numbers entered
as in column 6 in half-dollar limits, there are only three modes,
or if we neglect the small group of 8 at $5.00 only two. The
position of the largest group of 1,472 is not at once assignable
more closely than as between .75 and 1.25. We can get a little
closer by the following method:

Numbers earning as much as $0.65, and not as much as $1.15      944
   "       "        "       0.75,      "       "       1.25    1,472
   "       "        "       0.85,      "       "       1.35    1,458
   "       "        "       0.95,      "       "       1.45    1,747
   "       "        "       1.05,      "       "       1.55    2,012
   "       "        "       1.15,      "       "       1.65    1,780
   "       "        "       1.25,      "       "       1.75    1,297
   "       "        "       1.35,      "       "       1.85    1,527
   "       "        "       1.45,      "       "       1.95    1,127
   "       "        "       1.55,      "       "       2.05      934
934 
Determination of the Mode.
Numbers of Wage-Earners from the Senate Report, 1893, U.S.A.

[Table not recoverable from the scan. Surviving column headings:
In 30-Cent Groups; In 50-Cent Groups. Surviving entries: 725;
2,012; 934; 640; 322; 285; 114.]

Now the greatest number is in the group $1.05-$1.55, and
the "mode" may be stated as near the middle point of the
group, viz., $1.30, not at this point, for there are only 99
wage-earners in the group $1.25-$1.34.

Another method of approximating to the mode may be
illustrated as follows: When the numbers are tabulated in
10-cent groups, as on p. 91, the mode is quite indeterminate;
in 20-cent groups the successive numbers beginning at .25-.44
are 16, 144, 270, 370, 989, 557, 538, 531, &c., and the number
989 (in the group $1.05-$1.24) is a distinct mode; if we begin
the 20-cent groups at .35-.54, the numbers are 74, 242, 282, 505,
784, 924, 274, &c., and 924 (in the group $1.35-$1.54) is a mode;
by this double tabulation it is seen that the 20-cent grouping
does not decide the mode. In 30-cent groups we have 355, 674,
1,242 ($1.15-$1.44), 740, &c., if we begin with $.55-$.84; we
have 439, 1,190 ($.95-$1.24), 1,023, &c., if we begin with
$.65-$.94; and 483, 1,088 ($1.05-$1.34), 996, &c., if we begin
with $.75-$1.04: the mode by each of these groupings lies in a
group which contains $1.15 to $1.24, and this smaller group may
be assumed to contain the mode, which is thus at or near $1.20.
The example here taken is drawn from a group of very irregular
figures, which specially illustrate the difficulties. The method
just adopted may be summarised thus: Tabulate the figures
again and again in gradually widening groups till regularity is
obtained; then examine again the groups which have the selected
width and see if the mode is shifted when the lower limit of the
grouping is moved; if it is shifted the groups are not wide
enough; if it is not, the mode is in the smallest group common
to the larger equal groups which all contain it. A more accurate
diagrammatic method is described on p. 154.
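The method just summarised can be sketched in Python. The 10-cent counts used here are not copied from the table on p. 91; they are an assumption, obtained by differencing the 20-cent totals quoted above together with the two 10-cent figures given in the text (685 and 99), and they cover only the range $0.25 to $1.84. The function names are likewise introduced for the illustration.

```python
# Assumed 10-cent counts for $0.25-$0.34, $0.35-$0.44, ..., $1.75-$1.84,
# derived by differencing the 20-cent totals quoted in the text.
counts = [1, 15, 59, 85, 157, 113, 169, 201, 304, 685,
          99, 458, 466, 72, 202, 329]


def regroup(counts, width, offset):
    """Sum `width` adjacent fine groups, starting at index `offset`.

    Returns a list of (start_index, total) pairs for the complete groups.
    """
    return [
        (i, sum(counts[i:i + width]))
        for i in range(offset, len(counts) - width + 1, width)
    ]


def modal_group(counts, width, offset):
    """Indices of the fine groups making up the largest wide group."""
    start, _ = max(regroup(counts, width, offset), key=lambda g: g[1])
    return set(range(start, start + width))


def locate_mode(counts, width):
    """Fine groups common to the modal group for every lower limit.

    An empty result means the mode shifts as the lower limit is moved,
    that is, the groups are not yet wide enough.
    """
    common = set(range(len(counts)))
    for offset in range(width):
        common &= modal_group(counts, width, offset)
    return sorted(common)


for width in (2, 3):
    found = locate_mode(counts, width)
    limits = [f"${0.25 + 0.10 * i:.2f}" for i in found]
    print(f"{10 * width}-cent groups: lower limits of common groups {limits}")
```

With 20-cent groups the two tabulations disagree and nothing is common to them; with 30-cent groups the three tabulations share only the group beginning at $1.15, in agreement with the text.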

Even when our numbers are initially regular, it is seldom
easy to determine the mode exactly. The difficulty is best seen
by an example. Suppose that we have the following returns as to
heights of a large number of men:

67  in.    455
67¼ in.    475
67½ in.    490
67¾ in.    500
68  in.    485
68¼ in.    467
68½ in.    445
At first sight the mode appears to be at 67¾ in. exactly ; but it
must be remembered that even in accurate measurements all
heights within ⅛ in. of 67¾ in. will be entered as 67¾ if the
measurements are taken to the nearest quarter inch, or will have
been tabulated in this way if the measurements were more accurate.
Hence 67¾ in. in reality stands for from 67⅝ to 67⅞ in.
If the 500 heights so entered were distributed uniformly through
this interval, the mode might be given as 67¾ in. with fair
accuracy ; but there are signs in the figures that the mode is
below this. Suppose that the figures in reality come from the
following measurements : —

      From 67¼ to 67⅜ in.   238 }  483 at 67⅜ in.
        „  67⅜ „  67½  „    245 }
        „  67½ „  67⅝  „    245 }  495 at 67⅝  „
        „  67⅝ „  67¾  „    250 }
        „  67¾ „  67⅞  „    250 }  493 at 67⅞  „
        „  67⅞ „  68   „    243 }
        „  68  „  68⅛  „    242

and that these had been tabulated as in the last column, the
mode would appear as 67⅝ in. ; while the same figures tabulated
as before gave it as 67¾ in. The probability of some such
shifting is seen from the original grouping, where the number at
67½ in. is greater than that at 68 in. From this discussion we
may see that the mode is always a little indefinite, depending on 
the width of the groups in which the items are tabulated, and on 
the exact position of the limits of the groups. As the items we 
deal with become more numerous, we shall find regularity when 
they are tabulated in narrower groups, and the mode can be 
assigned with greater accuracy. A more satisfactory method of 
determining the mode is that given on p. 155. 
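The shifting of the mode with the limits of the groups can be checked by a short calculation. The sketch below is an illustration only: the counts per eighth of an inch are an assumption, chosen to follow the worked example as far as it can be read and to reproduce the totals 475, 490, 500 and 485 of the first table; the names `tabulate` and `mode_of` are hypothetical.

```python
# Assumed counts per eighth-inch interval, keyed by the lower limit of the
# interval expressed in eighths of an inch (537 eighths = 67 1/8 in.).
eighths = {
    537: 237, 538: 238, 539: 245, 540: 245,
    541: 250, 542: 250, 543: 243, 544: 242,
}

def tabulate(eighths, first_lower):
    """Join adjacent eighth-inch intervals in pairs, beginning at first_lower.

    Returns {centre of the pair in inches: number of heights}. The centre
    of a pair is the upper limit of its first interval.
    """
    table = {}
    lower = first_lower
    while lower in eighths and lower + 1 in eighths:
        centre = (lower + 1) / 8
        table[centre] = eighths[lower] + eighths[lower + 1]
        lower += 2
    return table

def mode_of(table):
    """The group centre with the greatest number of entries."""
    return max(table, key=table.get)

nearest_quarter = tabulate(eighths, 537)  # groups centred on 67 1/4, 67 1/2, ...
shifted = tabulate(eighths, 538)          # groups centred on 67 3/8, 67 5/8, ...

print(nearest_quarter, mode_of(nearest_quarter))
print(shifted, mode_of(shifted))
```

On these assumed figures the first tabulation places the mode at 67.75 in. and the second at 67.625 in., although the underlying measurements are the same.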

[Marginal note: The "average man."] Now is the "average workman" the man who earns $1.73
per diem, the simple average of the whole group on p. 120, or a
man making $1.20, the mode? In ordinary speech
the latter is meant. The "average clerk" is not
the one whose measurable qualities are an arithmetic mean of
all similar qualities, but one whose qualities are found in the
same degree in the greatest number of his fellows. There are
more clerks who read the evening paper than who read Homer,
more who go to music-halls than to oratorios, more whose
incomes are £100 than £500, more who live four miles from
the City than one or twenty. Even with this explanation the 
average man is not a real creature, for fortunately no individual 
has no qualities out of the common. The fact that the average 
is a pure abstraction is of importance directly we apply statistics 
to actual affairs ; these American workpeople cannot be legislated 
for in the mass as if they all earned $1.20, or as if those who 
were alike in this did not differ in other respects, even doing very 
varying quantities of work for this wage. [Marginal note: Importance of the mode.] No single measurement
expresses completely even the economic condition of a
group of workmen, but if we are taking a single
measurement, that of the "mode" is often the
most useful. It is at the mode that we find the greatest number
of whose greatest good we may be thinking. Whereas the
arithmetic mean and the "median" may correspond to no reality
but be merely numerical conceptions, the mode is precisely that 
number for which most instances can be found. It shows the 
commonest result, that most often obtained, and is of very 
general application. For an intending passenger by train or 'bus, 
it is more important to know the most ordinary than to know the 
average number in a compartment. The mode rather than the 
average in chest measurements is the number most suitable for 
the ready-made clothier. For providing a post-office or a store, 
the mode in postal orders or prices of tea needs to be known 
rather than any other average. Even the favourite coin in a 
collection may show the spirit of the congregation better than 
the arithmetic average of their contributions. In these last 
instances it may be noticed that the mode is quite definite.

[Marginal note: Advantages of the mode.] A special feature of the mode is that it is entirely uninfluenced
by extremes. A cheque for £1,000 in a collection disturbs the
arithmetic average, but not the mode. The incomes
of a small number of millionaires and an army of
paupers may have the same arithmetic average as a nation com- 
posed entirely of people moderately well off; but the modes will 
be very different in the two cases. In considering the change 
year by year in a group of figures, as for instance, the wages of a 
large group of workmen, we cannot tell, if we take the arithmetic
average as our criterion, whether an improvement is due to a 
levelling up of the badly paid or a rapid increase for those who 
were already well off, while the mode will show the changing 
position of the main body. Mr Booth's "London" is crowded
with instances of this maximum density method. Each age
diagram shows the mode in ages for an occupation ; each wage
list the mode in wages. His whole description of Class E, the
typical workman of modern towns, is based on the same principle.
His measurement of social status, based on the number
of rooms occupied or servants employed, can be used more easily 
for stating the mode (four rooms to a family and no servant) 
than any other average. 

[Marginal note: Shortcomings of the mode.] An objection to this average is that there are many groups
of figures to which it is not applicable. If we have a very irregular
group of numbers with no particular type,
such as the populations of towns in England,
the mode would be quite indefinite, or if found, would give no
information of importance. The use of the mode is to indicate 
the type from which other figures may be regarded as diverging. 
Thus, in these wage figures, the type is about $1.20, and other 
examples lie on either side, wages of men who have for some 
reason or other above or below the normal degree of skill or 
opportunity. If there is a type, as in Quetelet's instances, the 
mode will show it. The mode only tells us one fact, however, 
about each type, whereas the methods already given (p. 92) show 
us several. 

D. The Median. — The median, with its dependents, the 
quartiles, deciles, and percentiles, has already been used on 
p. 92. Arrange all the items of the group in ascending order 
of magnitude ; the item half-way up the list is the median ; 
those one-quarter and three-quarters up are the quartiles ; those 
one, two . . . nine-tenths up are the deciles ; those one, two 
. . . ninety-nine hundredths up are the percentiles. [Marginal note: Advantages of the median.] The median
is the most useful of the averages ; so useful that it is worth
effort to engraft the word and its meaning on the
public and official minds, where perhaps it may
bear fruit by the year 2,000.* It is very nearly definite in position,
thereby differing from the mode ; if we have an odd number of
items, it is the middle one ; if an even number, it lies between
the two middle items, which are generally very near together, or
coincides with them if they are equal. It is not affected by

* While this was in the press, Mr Wilson Fox's Report on the Agricul- 
tural Labourer was published ; on p. 25, the median is explicitly used. 
exceptional entries at all ; the existence of any number of
millionaires has no more effect on the median income than of an
equal number of any other persons whose incomes are above the
median. For many purposes it is of course necessary to allow
these extreme instances more weight than those which are nearer
the average ; but the arithmetic average often gives them undue
weight for this democratic age, since a single millionaire can
counterbalance thousands of ordinary working men. A further
advantage is that it is extremely simple to find, not needing
much arithmetical work, for we need not do more than count
those well above and well below the average, and look more
carefully at those near it.
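The indifference of the median to exceptional entries is easily shown by a small calculation. The incomes below are invented for the purpose; only Python's standard `statistics` module is used.

```python
from statistics import mean, median

# Invented incomes in pounds; nine persons, median 100.
incomes = [80, 90, 95, 100, 100, 105, 110, 120, 150]

# Add one person above the median: first a millionaire, then an ordinary man.
with_millionaire = incomes + [1_000_000]
with_ordinary = incomes + [160]

print(median(incomes), median(with_millionaire), median(with_ordinary))
print(mean(incomes), mean(with_millionaire), mean(with_ordinary))
```

The median moves by the same small amount in both cases, since all that matters to it is that one more income lies above it; the arithmetic average, on the other hand, is carried far above every ordinary income by the single millionaire.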

[Marginal note: No need for complete information.] There is a yet more important advantage in the use of the
median ; it can often be found exactly, when our information as
to the items in question is neither accurate nor
complete. This will be clear from one or two
examples. It may be that in the "wage census"
100,000 persons, whose wages were far below the average,
did not come into the returns at all, and it is very difficult
to estimate their effect on the arithmetic average, for want
of information as to their earnings ; but to find the median
exactly, we need only know their number, not their earnings ;
and if we can only assign a maximum for their number, we still
can place the median within narrow limits. The addition of
100,000 men with wages below 15s. to the general summary for
the 356,000 men, would still leave the median in the group
20s. to 25s. where it already is ; the change would be very
marked, however, in the lower deciles and quartiles, and the
arithmetic average would be lowered by at least 2s. 1d. The
same argument applies to incomes ; information is often very
deficient, but it is in many cases possible to assert that a number
of men, whose exact income is unknown, receive above a certain
assigned sum, or even between two assigned limits, which is all
we need to know about them to determine the median, if it lies
below the lower limit.
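The argument may be illustrated by a rough sketch. The grouped returns below are hypothetical: the total of 356,000 men and the position of the median in the group 20s. to 25s. are taken from the text, but the division among the groups is invented, and `median_group` is a name chosen only for this illustration.

```python
def median_group(groups):
    """Return the label of the group containing the median.

    `groups` is a list of (label, count) pairs in ascending order of wage.
    Only the counts are needed, not the earnings within each group.
    """
    total = sum(count for _, count in groups)
    half = total / 2
    running = 0
    for label, count in groups:
        running += count
        if running >= half:
            return label
    raise ValueError("empty distribution")

# Hypothetical division of the 356,000 men among wage groups.
returned = [
    ("under 15s.", 20_000),
    ("15s.-20s.", 70_000),
    ("20s.-25s.", 130_000),
    ("25s.-30s.", 80_000),
    ("30s. and over", 56_000),
]

# 100,000 further men known only to earn less than 15s.
augmented = [("under 15s.", 20_000 + 100_000)] + returned[1:]

print(median_group(returned), median_group(augmented))
```

On these assumed figures the median remains in the group 20s. to 25s. after the addition, although nothing is known of the 100,000 beyond their number and the limit below which they fall.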

Again, in tracing the history of wages throughout the century
it is often very difficult to find the correct average, but at the
same time it is frequently possible to say that a very large class
of men earned below, say, 15s. a week, and another very large
class above 30s. whose wages we do not exactly know, and a
more definite number between 15s. and 20s., and 25s. and 30s. ;
and in order to find the median all we need to do is to investigate
more exactly the wages between 20s. and 25s., and even if
we have not complete information here, we can still say that the
median certainly lies between certain narrow limits. [Marginal note: Incommensurable quantities.] There is
yet another advantage, perhaps more important, that the median
is applicable to quantities which are not capable
of measurement at all. This development is especially
due to Mr F. Galton.* Suppose it to be required, for example,
to find among a large class of boys the average in intelligence.
It is clear that it is not easy to find the arithmetic average of a
quantity which cannot be properly measured even by the most
elaborate system of marks, but on the other hand it would not
be at all difficult with a class of, say, twenty boys, to place them
in order of intelligence without committing oneself to such a
statement as that A.'s cleverness was 25 per cent. more than
B.'s ; and the tenth or eleventh boy in this arrangement
would show the style of boys in the class, at least as well
as any other average. [Marginal note: Disadvantages.] The disadvantage of this method, the
reason why it is not universally applicable, is that the median
of a series of observations may be totally removed
from its type, and in fact may not be situated near
any of the different objects which are observed. Thus, if we
had two large groups of wages of a thousand men between 15s.
and 25s., and another thousand between 35s. and 45s., the median
would give us any position between 25s. and 35s., where as a
matter of fact not a single wage-earner would be found. The
median is then chiefly useful when we are dealing with a series
of objects of which the main part lie fairly close together ; a few
extremes do not affect it.†

The following table shows the description of 76 items by the
help of the various averages now described : —

* See, for instance, Natural Inheritance, p. 47.

† On the relative advantages of this, and a more mathematical method,
see Yule and Galton in the Statistical Journal for 1896, especially pp.
392-398.
Measurements of Boys of Ages 13 to 15 Years.

No.   Age.        Height.      Weight.          No.   Age.        Height.      Weight.
      yrs. mth.   ft. in.      st. lb.                yrs. mth.   ft. in.      st. lb.

 1    14.1        4.11[?]      6.0[?]           39    14.7        4.11[?]      6.3[?]
 2    14.9        4.10         5.7              40    13.1        [illegible]  [illegible]
 3    14.7        5.5[?]       7.5              41    14.3        4.11         6.4[?]
 4    13.11       5.0          6.3[?]           42    13.3        4.4[?]       4.11[?]
 5    14.11       5.3[?]       8.0[?]           43    14.3        5.3[?]       6.7[?]
 6    14.7        4.10         5.0              44    13.6        5.1[?]       6.13[?]
 7    14.3        4.10         6.7              45    14.2        4.8[?]       6.0[?]
 8    14.9        5.5          8.5[?]           46    [illegible] 5.2          7.4
 9    14.11       4.9[?]       5.12[?]          47    13.8        5.2[?]       6.11
10    14.3        4.11[?]      6.[illegible]    48    14.6        5.4          7.4[?]
11    13.4        4.7          5.[illegible]    49    14.8        5.1[?]       6.10
12    14.7        [illegible]  [illegible]      50    13.3        4.8[?]       5.0
13    13.8        [illegible]  [illegible]      51    13.0        5.[illegible] 6.7
14    14.5        5.2[?]       [illegible]      52    13.10       4.11[?]      7.3[?]
15    14.4        5.0          6.0              53    14.8        4.11[?]      6.9[?]
16    13.6        4.9          5.6              54    13.8        4.5[?]       4.9[?]
17    14.0        [illegible]  7.7[?]           55    14.8        5.4[?]       7.0
18    13.0        5.3          [illegible]      56    14.0        4.10         6.2[?]
19    14.7        4.11         6.12[?]          57    13.10       4.9          5.5
20    14.10       5.1          6.9              58    13.2        5.0[?]       6.4
21    13.9        4.11         5.11             59    13.6        4.7          5.2[?]
22    14.10       4.8[?]       5.11             60    13.0        4.9          5.9[?]
23    13.4        4.9[?]       5.8[?]           61    13.3        4.8[?]       5.5[?]
24    13.1        5.2[?]       6.1              62    13.5        4.8[?]       6.5[?]
25    14.0        4.6[?]       5.6[?]           63    13.10       5.5[?]       7.10[?]
26    14.6        5.3[?]       7.[illegible]    64    13.1        4.8[?]       6.2[?]
27    14.3        5.0[?]       5.11[?]          65    13.10       5.4          7.[illegible]
28    13.9        4.9          5.[illegible]    66    14.0        4.9          5.0[?]
29    13.4        5.[illegible] 5.9             67    13.3        4.7          5.0
30    14.4        5.1          6.8[?]           68    13.8        4.11         6.[illegible]
31    14.10       4.9[?]       4.7[?]           69    13.7        4.11[?]      6.4[?]
32    13.2        4.9[?]       5.13[?]          70    13.11       4.8          4.4[?]
33    14.1        4.8[?]       5.8[?]           71    13.11       4.8          4.4[?]
34    13.10       5.2[?]       6.8[?]           72    13.2        4.7[?]       4.10
35    14.0        4.11[?]      5.7              73    14.0        4.11         6.5
36    14.4        4.11         6.5              74    13.3        4.3[?]       4.1[?]
37    14.8        4.11         6.0[?]           75    13.3        5.2          7.2[?]
38    13.7        5.0[?]       6.2              76    13.7        4.8[?]       5.6

Tabulation of Weights.

Arithmetic average, 6 st. 1[?] lbs.
The same, when weights are entered only to nearest stone, 6 st. 1[?] lbs.
Median, 6 stones 1[?] lbs.
Quartiles, 6 st. 9[?] lbs., 5 st. 6[?] lbs.
Average of quartiles, 6 st. 1 lb.
Half of the examples lie within 9 lbs. of median.
Mode is between 6 st. and 6[?] st.
Average weight between ages 13 and 13½ years, 5 st. 9[?] lbs. ; 13½ and 14 years, 5 st. 13[?] lbs. ; 14 and 14½ years, 6 st. 3[?] lbs. ; 14½ and 15 years, 6 st. 8[?] lbs.
Heights may be tabulated in the same way.



[Marginal note: Graphic method.] A graphic method of finding the median closely is given by
Mr Galton in the Report of the Anthropometric
Committee of the British Association, 1881, p. 247 ;
and is illustrated by the diagram facing the next page.

On a horizontal line mark off equal intervals representing
units of measurement, say inches. On a vertical scale
mark off equal intervals representing the number of instances,
e.g., persons whose heights are measured. Beginning at the
lowest, say, 51¼ inches, on an imaginary vertical line mark as
many dots at equal intervals on the vertical scale as there are
persons at that height, so that each dot represents one person.
From the highest dot thus marked, suppose a horizontal line
drawn till it is over the next height division, 51½ inches, and
with this new base proceed as before, marking each instance
at 51½ inches by a dot vertically above the 51½-inch mark.
Next draw a connected line through the middle points of the
consecutive vertical rows of dots ; if there is an odd number
of dots, the middle one is taken as the middle point ; if an even
number, the middle point is half-way between the middle ones.

On the vertical scale mark the positions of the median,
quartiles, &c., obtained by dividing the distance representing
the total number of instances into appropriate parts, and
through these points draw horizontal lines to intersect the
connected line already drawn. The points of intersection
lie vertically above the heights required, as marked on the
horizontal scale.

Now it may be assumed that the heights of all persons
returned at, say, 58¾ inches, are in reality evenly distributed
between the limits 58⅝ and 58⅞ inches, heights lying within
which would be so returned ; and it can be verified that the
construction just given shows the place of the median, deciles,
&c., almost exactly on this hypothesis.
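The hypothesis of even distribution within each interval can also be worked arithmetically, which gives a check on the diagram. The sketch below is not Mr Galton's construction itself but a numerical counterpart of it, offered as an assumption: each recorded height is taken to stand for an interval of one `step` centred on it, and the required position is found by simple proportion. The heights are invented and `quantile_grouped` is a hypothetical name.

```python
def quantile_grouped(freq, fraction, step):
    """Position below which `fraction` of the instances lie.

    `freq` maps each recorded value to its number of instances; a value v
    is assumed to stand for the interval v - step/2 to v + step/2, with
    the instances spread evenly through it.
    """
    total = sum(freq.values())
    target = fraction * total
    running = 0
    for value in sorted(freq):
        count = freq[value]
        if count and running + count >= target:
            lower = value - step / 2
            return lower + step * (target - running) / count
        running += count
    raise ValueError("fraction must lie between 0 and 1")

# Invented returns: height in inches (to the nearest inch) -> number of boys.
heights = {57: 2, 58: 5, 59: 6, 60: 4, 61: 3}

median = quantile_grouped(heights, 0.5, 1)
lower_quartile = quantile_grouped(heights, 0.25, 1)
upper_quartile = quantile_grouped(heights, 0.75, 1)
print(lower_quartile, median, upper_quartile)
```

For deciles the same function may be called with fractions of one-tenth, two-tenths and so on.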

E. Geometric Mean. — It is not necessary to give a long
discussion of the geometric or logarithmic mean, for its application
is limited to a small class of figures which will be best
dealt with at a later stage.* It was used by Jevons in
his essay on the Fall in the Value of Gold, but he did
not justify or explain its use. If we have n quantities
\(a_1, a_2, \ldots, a_n\), their geometric mean is \(\sqrt[n]{a_1 a_2 \cdots a_n}\). Its
chief advantage is that the influence of large numbers is
diminished and of small numbers increased, when the geometric
mean is employed instead of the simple average.
Suppose all the following groups of numbers represent price
levels of various commodities as percentages of their height
at a previous date : —

Numbers.                                          Arithmetic    Geometric
                                                  Mean.         Mean.
80, 160                                           120           113
80, 80, 100, 324                                  146           120
20, 20, 80, 80, 120, 120                          73.3          58
20, 20, 80, 80, 100, 100, 120, 160, 324, 972      198           104

* See p. 223, infra.
GRAPHIC METHOD OF FINDING MEDIAN, QUARTILES AND DECILES (after Galton : Anthropometric Committee : Brit. Assn.).
For the Heights of the 76 boys, between ages of 13 and 15, stated on p. 127.

[Diagram : horizontal scale of heights in inches, from 51 in. upwards.]

Median 59[?] inches. Quartiles [illegible]. 'Probable error' 2.2.
Deciles 55.5, 56.6, 57, 57.9, 59.7, 60.6, 62[?], 63.6.
Arithmetic average, 59.095. Greatest density 57 or 59 ; in smoothed curve would be about 58.
Geometric average 58.98.

A consideration of the last list leads to the conclusion that the
general rise of price cannot be 98 per cent., while 4 per cent. may
reasonably represent it. A tentative rule may be suggested : when
the geometric mean differs much from the arithmetic it should be
preferred. It should be calculated with the help of logarithms.
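The calculation by logarithms may be sketched as follows, using the four groups of price levels given above. The function names are chosen for this illustration only; natural logarithms are used, which serve as well as common ones.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive quantities, found by
    averaging their logarithms and taking the antilogarithm."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def arithmetic_mean(values):
    return sum(values) / len(values)

price_levels = [
    [80, 160],
    [80, 80, 100, 324],
    [20, 20, 80, 80, 120, 120],
    [20, 20, 80, 80, 100, 100, 120, 160, 324, 972],
]

for group in price_levels:
    print(round(arithmetic_mean(group), 1), round(geometric_mean(group), 1))
```

For the last list the arithmetic mean comes to about 198 and the geometric to about 104, which is the contrast relied on in the text; for the third list the geometric mean is about 57.7.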

F. Statistical Coefficients. — Before leaving the subject
of averages, we must pay some attention to "statistical
coefficients." A statistical coefficient is a number, whole or
fractional, by which a total (e.g., population) must be multiplied
to give an allied number (e.g., number of births). Thus,
if the birth-rate is 40 per 1,000, the coefficient is .04. These
coefficients play an important part in ordinary statistics and
a very interesting rôle in the application of the law of error
to demography. The population may increase or diminish,
but the coefficients relating to certain numbers remain almost 
unchanged,* and by their use the statistics of different coun- 
tries may be compared, and numbers for future years can 
be forecasted in some cases with marvellous accuracy ; the 
numbers of births, marriages, deaths in 1901 can be written 
down before their occurrence as exactly as they are needed, 
subject only to the chance of some great catastrophe. Coeffi- 
cients can be formed for births (in various districts), for deaths 
(according to age, profession, or disease), for marriages (at 
various ages), for suicides, crimes, accidents, consumption of 
various commodities ; if the preliminary data could be obtained, 
for the number of persons crossing Westminster Bridge in the 
year, the number of visitors to the Monument, the number of 
umbrellas left in the train, and so on ; the list could be 
prolonged indefinitely. The more important coefficients are 
calculated for most civilised countries, and the rates on which 
they are based published in statistical abstracts. A knowledge 
of them is necessary for statistical investigations. 

[Marginal note: Calculation of coefficients.] A useful caution is given by Dr Bertillon.† In order that
a coefficient may obey the laws of coefficients closely, the
number to which it is to be applied should not
be that of the total population, but the number
of persons or things capable of affording an instance of the
resulting total. Suppose m to be this number of persons, c the
coefficient, n the resulting total, then \(n = cm\) and \(c = \frac{n}{m}\). Thus,
for the marriage rate, m should not be the total population,
but the number of marriageable people. The importance of
this rule is, however, not great as far as simple calculations
are concerned ; for the less accurate coefficient can be easily
seen to be nearly constant. Suppose M to be the total population,
m the number of marriageable people, n the number
of marriages. Let \(c_1\) be the coefficient for calculating the
number of marriageable people, then \(c_1 = \frac{m}{M}\) ; \(c_2\) the coefficient
for marriages on Bertillon's principle, then \(c_2 = \frac{n}{m}\). Let \(c_3 = \frac{n}{M}\), the
more usual coefficient for marriages. Then \(c_3 = \frac{m}{M} \times \frac{n}{m} = c_1 \times c_2\). Now
if \(c_1\) and \(c_2\) are invariable, so also is \(c_3\). If, however, one of
the factors, say \(c_1\), is more variable than the other, then \(c_3\)
varies as much as \(c_1\), and the greater constancy of \(c_2\) is not
discovered, if only \(c_3\) is calculated.

* See infra, Part II.
† Cours élémentaire, p. 94 seq.
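A small numerical case may make the point plainer. The populations and numbers of marriages below are invented; the function `coefficients` is a hypothetical helper which returns the three coefficients for marriageable people per head, marriages per marriageable person, and marriages per head.

```python
def coefficients(total_population, marriageable, marriages):
    """Return (c1, c2, c3) for one year's figures."""
    c1 = marriageable / total_population  # marriageable people per head
    c2 = marriages / marriageable         # coefficient on Bertillon's principle
    c3 = marriages / total_population     # the more usual marriage coefficient
    return c1, c2, c3

# Two invented years in which c2 is constant while c1 changes.
year_a = coefficients(1_000_000, 400_000, 8_000)
year_b = coefficients(1_000_000, 500_000, 10_000)

for c1, c2, c3 in (year_a, year_b):
    print(c1, c2, c3, c1 * c2)
```

Here the coefficient per marriageable person is the same in both years, while the usual coefficient rises by a quarter, exactly as the proportion of marriageable people does; a reader who had only the usual coefficient before him would not discover the constancy of the other.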

G. General. — [Marginal note: The function of averages.] The function of averages will now be clear ;
it is to express a complex group by a few simple numbers. The
mind cannot grasp the magnitudes of millions of
items at once ; they must be grouped, simplified,
averaged. The averages chosen must be those which will give
the striking features and the essential characteristics of the group.
Different methods will apply to groups of various classes ; each
must be taken on its own merits. A good and suitable average
has the following characteristics : — If there is a type it shows it ;
it gives due influence to extreme cases ; it is not easily affected by
errors or much displaced by slight alterations in systems of calculation ;
and it is easily calculated.

The relative positions of the different kinds of averages dealt
with give some information as to the general nature of the group
to which they refer. The arithmetic average, median and mode,
are close together, if the group is symmetrical. The arithmetic
average is probably above the median, if we have a small group
at a high degree. The arithmetic average is generally below the
median, if there is an absence of high numbers, and a concen- 
tration a little above the average. The mode will be badly 
defined, if our group is not homogeneous. The mode will pro- 
bably be below the arithmetic average, if there is a small group 
at a high degree. The mode is well marked, if the distribution is 
uniform. These rules are only tentative and easily nullified by 
exceptional circumstances. 
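Two of these tentative rules may be tried on a small scale. The two groups of figures below are invented, the second differing from the first only by one item at a high degree; the functions are those of Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Invented groups: the second replaces the highest item by a much higher one.
symmetrical = [4, 5, 5, 6, 6, 6, 7, 7, 8]
high_tail = [4, 5, 5, 6, 6, 6, 7, 7, 30]

for group in (symmetrical, high_tail):
    print(mean(group), median(group), mode(group))
```

In the symmetrical group the three averages coincide at 6; in the other the median and mode remain at 6 while the arithmetic average is drawn above them.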
CHAPTER VI.

SOME EXAMPLES OF THE USE OF AVERAGES IN
TABULATION.

If our analysis of the nature and use of averages is complete,
and if averages are of widely extended use, we
should now be able to express almost any group
of figures by a few well-chosen numbers of definite significance.

[Marginal note: Application of averages to train service.] To apply a somewhat severe test at first, let us choose
a familiar example from ordinary life, and consider how a
suburban business man might test the merits of
two railway systems, by one of which he intended
to take a season ticket.

The following table gives the train service between Leatherhead
and London in 1898 : —

Train Service — Leatherhead to London.
Number of Minutes to Journey.

Waterloo—
  Down—[?], 50, 52, 48, 47, 61, 50, 44, 48, 53, 45, 42, 45, 49, 43, 48, 42, 43.
  Sundays—[?], 50, 47, 49, 50.
  Up—51, 46, 51, 48, 43, 44, 48, 48, 64, 45, 48, 47, 45, 47, 46, 47.
  Sundays—[?], 48, 51, 51, 51.
London Bridge—
  Down—67, 65, 65, 61, 74, 51, 56, 66, 65, 5[?], 59, 41, 49, 44, 58, 57, 5[?], 67, 80.
  Sundays—[?], 52, 66, 68, 88, 65, 65, 68, 65.
  Up—69, 57, 53, 58, 54, 41, 58, 52, 42, 40, 55, 67, 79, 98, 69, 66, 68, 64, 71.
  Sundays—72, 71, 69, 70, 62, 81, 73, 73.
Victoria—
  Down—[?], 65, 55, 76, 77, 88, 48, 53, 46, 69, 89, 54, 82, 71, 9[?].
  Sundays—[?], 45, 81, 84, 78, 61, [?], 83, 85.
  Up—87, 65, 69, 69, 47, 48, 51, 83, 101, 58, 62, 61, 76, 103.
  Sundays—[?], 76, 80, 85, 85, 82, 94.

The following table gives us the necessary information : —

                                     London        Victoria.     Waterloo.
                                     Bridge.
                                     Min.          Min.          Min.
Average of four quickest trains      [illegible]   46½           42½
Lower decile                         [illegible]   48            [illegible]
Median                               [illegible]   77            48
Mode                                 [illegible]   ...           48
Number of trains on week days        [illegible]   29            [illegible]
General average                      63            73            [illegible]
It is to be noticed that the statistical method is generally
limited to one aspect of a problem ; the question of punctuality
might, indeed, be easily treated statistically, but the questions
of comfort and relative picturesqueness of route will elude our
analysis.

The next example shows a method of throwing into relief
the characteristics of a typical group of sociological data.

[Marginal note: Tabulation of wage returns.] The adjoining table gives the wages recognised by the
Amalgamated Society of Engineers in many of
their branches in 1862 and 1891.

Amalgamated Society of Engineers. — Wages in 1862 and 1891, 
Weekly, exclusive of Overtime. 







X862. 


xSoi. 




Z862. 


1891. 






J. 


d. 


J. d. 




5, d. 


s. d. 


Accrington 


- 


- 27 





31 


Faversham 


- 34 


33 


Ashford - 


- 


- 33 


6 


30 


Folkestone 


- 34 


32 


Ashton-under- 
Bacup 


Lyne 


: .1 


3 

I 


34 
28 


Frome 


- 24 


'27 
I30 


Barrow-in-Furness 


- 31 





34 9 


Gainsborough - 


- 27 6 


28 


Barry 


- 


- 29 





31 


Glossop - 


- 27 2 


32 


Bath- 




- 27 





29 


Gloucester 


- 28 


32 


Bilston 




- 28 





30 


Grantham - 


- 28 6 


30 4 


Bingley - 




- 24 





29 


Grimsby - 


- 28 


32 


Birkenhead 




- 29 





35 6 


Halifax - 


- 23 I 


31 


Birmingham 




- 32 





36 


Hanley - 


- 28 3 


32 


Blackburn 




- 27 


6 


32 


Hartlepool 


- 26 


34 10 


Bolton 




- 27 


6 


28 
32 


Heywood - 


. 27 


'30 
34 


Bridgwater 
Brighton - 




- 24 


6 


24 


Holyhead - 


- 32 


28 




- 24 


H 


29 


Huddersfield - 


- 26 


26 


Bristol 




■ 31 





32 


Hull - 


- 27 6 


34 


Burnley - 




■ 27 





30 


Hyde 


(30 
128 


30 


Burton-on -Tren t 


• 25 





30 


28 


Bury 




- 28 


3 


'30 
132 


Ipswich - 
Keighley - 


- 28 6 


28 
27 


Cardiff - 




- 31 





34 


Kidderminster - 


30 


Carlisle - 




- 24 


6 


30 


Lancaster - 


- 25 


32 


Chepstow - 




■ 30 





34 


Leeds 


- 25 


30 


Chester - 




- 30 





32 


Leicester - 


. 26 


31 6 


Chowbent - 




- 26 





32 


Leigh 


- 27 9 


31 6 


Colne 




■ 25 





31 


Lincoln 


- 26 7 


28 6 


Congleton 




- 24 





28 


Liverpool - 
Llanelly - 


- 29 


34 


Coventry - 




- 28 





34 


• 22 


26 


Crewe 




- 29 


4 


30 


Macclesfield 


- 24 


29 6« 


Darlington 




- 25 





31 6 


Manchester 


- 29 9 


35 


Dartford - 




- 34 





38 


Mexborough 


- 27 


32 Q 


Darwen - 




■ 27 





32 


Middlesborough 


. 25 


34 


Derby 




- 26 





29 


Middleton- 




33 


Doncaster - 




- 28 


6 


31 6 


Milton and Elsecar 


- 28 


34 


Dover 




- 35 


6 


36 


Neath 


- 32 


30 


Enfield T/>rk 




- 36 





.^2 ^ 


Newark - 


. 25 


29 


Exeter 




• 23 





r28 
152 


Newcastle - 


. 25 


J35 
\37 
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1862. Z89Z. 





s. 


d. 


s, d. 


New Holland - 


• 30 


8 


34 


Newport - 


- 30 





32 


New Town (Stockport) 29 





32 


Newton Abbott - 


- 33 





33 


Northampton - 


• 26 





32 


Northfleet - 


- 36 





36 


North and So. Shields 26 





35 


Norwich - 


- 32 





29 


Nottingham 


- 27 


5 


34 


Oldbury - 


- 28 





34 


Oldham - 


- 29 





33 


Peterborough • 


- 28 


6 


33 


Plymouth - 


- 32 





33 


Pontypridd 


- 24 





30 


Portsmouth 


- 35 





34 


Preston - 


- 27 





32 


Radcliffe Bridge 


• 27 





/30 
I32 


Reading - 


- 28 





J32 
134 


Ripley . 


- 26 





26 6 


Rotherham 


■ 27 


6 


/35 ° 


Rugby 


- 32 





' 28 

32 


Rugeley - 


■ 24 


II 


30 


St Helens - 


- 28 





.34 

136 

36 


Sheffield ■ 


. 28 





Shipley - - 


- 25 


9 


f28 
1 30 


Shrewsbury 


• 30 


6 


32 


Smethwick 


- 28 





35 


Southampton - 


• 32 





34 6 


Sowerby Bridge 


- 24 


6 


30 



Stafford - 
Stalybridge 

Stockport - 

Stockton-on-Tees 
Stoke-on-Trent - 
Stroud and Thrupp 
Swindon - 
Todmorden 
Wakefield- 
Warrington 
Watford - 
Wednesbury 

Whitehaven 

Wigan 

Wolverhampton 

Wolverton 

Worcester - 

Bermondsey 

Blackwall - 

Bow - 

Greenwich 

King's Cross 

Lambeth - 

London, £. 
„ N. - 
» S. - 
„ W. - 

Marylebone 

Stratford - 

Tower Hamlets 
Woolwich • 



Z862. 

s. d. 
34 o 
28 3 

28 o 



Z89Z. 

X. d. 
30 o 



24 
29 
26 

31 
26 

Ik 



25 o 

28 o 
• 28 o 

2 

o 

4 
o 
o 
o 
o 
8 
o 



29 
31 

35 

34 

36 

34 

36 

35 

35 

35 10 

35 o 



35 
33 

r35 
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36 



32 





32 





34 





36 





32 





30 





31 


6 


28 





30 





34 





36 





31 





28 





36 





34 





33 





29 





30 






38 o 



The following figures show the same in brief : —

                          1.           2.           3.
                          1862.*       1891.*       1891.†
                          s. d.        s. d.        s. d.
Maximum                   36  6        40  6        ...
Upper decile              35           38           38
Upper quartile            31  4        34           36
Median                    28           32           34  3
Arithmetic average        28 10        32  4        33  4
Modes                     28           30, 32       ...
Lower quartile            26           30           31  6
Lower decile              24  6        28  6        30
Minimum                   22           24           ...

* Each branch counting as 1.

† The numbers of members in each branch counted as receiving
the wage recognised there.
If the rates at each branch were not those actually paid to
all members, but their average, while the actual wages were
confined within small limits of that average, the figures in the
last column would be little affected.

On comparing columns 1 and 2 it will be seen that not
only have all the averages increased, but that since the lower
decile and quartile have increased more rapidly than the upper,
the lower half has also gained on the upper. Again the wages
are grouped more closely in column 2 than in column 1.

[Marginal note: Measure of dispersion.]

It is important to choose a simple measure of the dispersion
of a group that can be easily appreciated and calculated,
that varies with sufficient sensitiveness with change
in dispersion, and can be applied generally in comparison
of group with group. The following satisfies all these
conditions: express half the distance between the quartiles as a
fraction of the arithmetic average; this fraction measures the
dispersion. For the above figures this quantity is

$\frac{2\text{s. }8\text{d.}}{28\text{s. }10\text{d.}} = .092$ in 1862, and $\frac{2\text{s.}}{32\text{s. }4\text{d.}} = .062$ in 1891.

The dispersion, therefore, diminished in that period. A more
satisfactory and complete measurement, of which this is an
adaptation, is discussed in Part II.
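
The arithmetic behind these two fractions can be checked with a short sketch. It assumes the quartiles and averages of columns 1 and 2 (26s. and 31s. 4d. about an average of 28s. 10d. in 1862; 30s. and 34s. about 32s. 4d. in 1891); the function names `to_pence` and `quartile_dispersion` are illustrative only and do not come from the text.

```python
def to_pence(shillings, pence=0):
    """Convert a sum in shillings and pence to pence (12 pence to the shilling)."""
    return 12 * shillings + pence


def quartile_dispersion(lower_quartile, upper_quartile, average):
    """Half the distance between the quartiles, as a fraction of the average."""
    return (upper_quartile - lower_quartile) / 2 / average


# 1862: quartiles 26s. 0d. and 31s. 4d., arithmetic average 28s. 10d.
d_1862 = quartile_dispersion(to_pence(26), to_pence(31, 4), to_pence(28, 10))
# 1891: quartiles 30s. 0d. and 34s. 0d., arithmetic average 32s. 4d.
d_1891 = quartile_dispersion(to_pence(30), to_pence(34), to_pence(32, 4))

print(round(d_1862, 3), round(d_1891, 3))  # 0.092 0.062
```

Working in pence avoids any rounding in the conversion; the measure itself is a pure ratio, so the unit chosen does not affect it.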

[Marginal note: Tabulation of descriptive answers.]

Group C of Tabulation. It was necessary to postpone
the tabulation of non-numerical or descriptive answers till we
had finished our discussion of averages. We have
now seen that the median and allied quantities can
be applied in many unexpected ways; and the
following detailed example shows how they can be used to give
a short description of a large group of adjectival answers.

In 1891 the Amalgamated Society of Engineers obtained
from all their branches answers to the question: To what extent
is overtime worked? The branch secretaries sent answers which
may be tabulated as follows:

Answers.                                  Number of     Number of
                                          Branches.     Members.

None                                           4            140
Not worked                                     1            [?]
Very little                                   23          4,836
To very limited extent                         1             63
Very occasionally                              1            350
A little on repairs                            1            500
Little                                         2             73
2 hours when necessary                         1             80
Seldom                                         1             59
Small extent                                   1             16
Seldom except on repairs                       1             66
Only on repairs                                2            216
Not much                                       6          1,125
On repairs                                     1            500
Not to any extent                              3            644
Not to a great extent                          2            162
Not general                                    1              7
Not systematically                             2             43
In cases of breakdown or emergency             7            606
2 hours regularly                              1            136
Chiefly on repairs                             1             20
Occasionally                                   2             90
When necessary                               [?]            348
Casually (sic)                                 2            142
A good deal on repairs                         1             23
Maximum 18 hours in 4 weeks                    1          1,000
Moderately                                     3            262
Systematically in good trade                   1            200
Average about 5 hours a week                   1             96
Considerably in marine shops                   1            400
Systematically in dockyard                     1            650
General                                        2            146
Systematically                                 1            693
Great amount                                   1            263
To a great extent                              1             72
Excessively                                    1            550
9 hours a week                                 1             39
10 hours a week                                1            106
12 hours a week (maximum)                      1            700
14 hours a week (when busy)                    1            106
10 to 18 hours a week                        [?]          5,000

Total                                         88         20,666

No answers                                    36         5,[?]4
As little as possible                          1            250
Not so much lately                             1            160
In machine shops for six months                1             60
In steel works                                 1            348

[Marginal note: Explanation of table.]

An inspection of the table here given will show sufficiently
the method of tabulation. The position of most of the answers
on an imaginary scale is fairly definite, except that
it is not always obvious where the numerical
answers should be placed; this must be decided either by internal
evidence or practical knowledge of the trade. The same adjectives
did not of course convey exactly the same numerical
meaning to all the branch secretaries who used them, but it will
be admitted that this tabulation gives a fairly clear view of the
case, and that the method of medians and quartiles may be
appropriately applied. Taking the member of a branch as the
unit and neglecting the unclassed answers, the median is
"Maximum 18 hours in 4 weeks" or "Moderately," the lower
quartile "Very little," and the upper quartile "14 hours when
busy." Taking the branch as unit, the median is "Not much,"
the quartiles are "Very little" and "When necessary" or
"Occasionally."

This method, which, with varying degrees of precision, is 
widely applicable, seems to afford the only way of comparing 
two such groups of answers. The precision attainable is to be 
measured by the distance through which the median can be 
shifted by making reasonable variations in the scheme of 
tabulation. 
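
The procedure can be sketched in a few lines, on the assumption that the answers have already been arranged on the imaginary scale from least to most overtime. The six answers and their figures below are hypothetical and much condensed, chosen only to show the mechanism; they are not the Society's returns, and the function name `ordinal_quantile` is likewise an illustration.

```python
def ordinal_quantile(scale, q):
    """Return the answer within which the fraction q of the total weight falls.

    scale: list of (answer, weight) pairs, already ordered from least to most.
    """
    total = sum(weight for _, weight in scale)
    target = q * total
    running = 0
    for answer, weight in scale:
        running += weight
        if running >= target:
            return answer
    raise ValueError("empty scale, or q greater than 1")


# Hypothetical condensed returns: (answer, branches, members).
answers = [
    ("None", 5, 300),
    ("Very little", 25, 5000),
    ("Not much", 15, 2500),
    ("Occasionally", 20, 3000),
    ("Moderately", 8, 4000),
    ("Systematically", 10, 6000),
]
by_branch = [(a, branches) for a, branches, _ in answers]
by_member = [(a, members) for a, _, members in answers]

print(ordinal_quantile(by_branch, 0.5))  # Not much
print(ordinal_quantile(by_member, 0.5))  # Occasionally
```

Re-running the function after moving a doubtful answer up or down the scale shows how far the median shifts, which is the test of precision described above.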

Summarisation. — Now that we have the method of averages 
at our disposal we may use it for tabulating and summarising a 
group of figures. 

Consider, for example, the answers to the questions issued 
by the Commissioners on Trade Depression in 1886. 

Four of the questions were : — 

1. Number of men in Society. 

2. Number out of work in 1885. 

3. Weekly wage in 1885. 

4. Change in wages between 1865 and 1885. 

The following table shows the answers given by the branch
secretaries of the Amalgamated Society of Engineers:

1.                    2.          3.          4.                5.
District.             No. in      No. Out     Current           Change between 1865 and 1885.
                      District,   of Work,    Wage.
                      1885.       1885.

Belfast               1,100       130         28/ to 36/        Slight increase.
Coventry              2,500       230         31/6              Contract work, 50 % decrease.
Dukinfield            170+        20+         31/               Slight increase.
Dundee                1,400       45 %        25/ skilled,      Time work: 1865, 22/; '72, 24/;
                                              15/ unskilled       '80, 26/; '83, 24/; '85, 25/.
Glasgow               28,000      4,000       26/               Time wages, 5 % above 1864.
Glasgow (St Rollox)   1,600       250         ...               Rise in 1872-73 of 15 %;
                                                                  1885 same as 1865.
Hartlepool            1,200       400         31/6              Advance of 3/.
Glossop               [?]         10          32/               ...
Liverpool             [?]         38          ...               Rise in 1872-73 of 7½ %;
                                                                  1885 same as 1865.
Monifieth             114         18          21/ [?]           Skilled work: 1865, 24/; '76, 27/;
                                                                  '78, 25/; '83, 28/; '85, 25/.
Nottingham            4,000       600         34/ minimum       1865, 28/; 1885, 34/.
Oldham                1,600       96          33/ average       Increase of 5 %.
Oxford                45          ...         [?]               ...
Paisley               800         ...         ...               1865, 26/; 1885, 28/6.
Preston               630         40          28/               None.
Preston               900         120         [?]               None.
Shipley               201         15          28/6 [?]          1865, 28/6; 1869-73, 32/;
                                                                  1885, 28/6.
Sowerby Bridge        1,120       43          28/               1865-75, 25/6; 1875-85, 28/.
Sunderland            3,200       400         33/               1864, 27/; '74, 34/; 1875-85,
                                                                  between 31/ and 37/.
Swindon               6,050       2           31/6              ...
Ulverston             45          ...         31/               1865, 26/; 1875, 31/.
Wednesbury            400         30          3[?]/             Increase of 2/.
Workington            170         70          28/ to 36/        Increase of 30 %.

It is suggested that the following are the summary tables 
which should be inserted in a report dealing with the answers. 

The figures are given here for only one society, but the 
tabulations are framed so as to include all. 





TABLE I. State of Employment.

Name of      Total Number*     Number Out    Percentage      Median of the Percentages
Society.     in Branches on    of Work.      Out of Work.    Out of Work in the
             Employment.                                     Various Branches.

A.S.E.       55,170            7,142         13              12
O.S.B.
&c.

* Details of some of the most important branches should be added.
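
The two summary figures of Table I, the percentage out of work on the whole and the median of the branch percentages, can be sketched as follows. The sketch assumes only seven of the districts (those whose figures are unambiguous in the returns), so its results are for illustration and are not expected to reproduce the 13 and 12 of the table; the variable names are illustrative.

```python
from statistics import median

# (district, number in district, number out of work), 1885: a subset only.
branches = [
    ("Belfast", 1100, 130),
    ("Coventry", 2500, 230),
    ("Glasgow", 28000, 4000),
    ("Hartlepool", 1200, 400),
    ("Nottingham", 4000, 600),
    ("Oldham", 1600, 96),
    ("Sunderland", 3200, 400),
]

total_members = sum(n for _, n, _ in branches)
total_out = sum(out for _, _, out in branches)

# Percentage out of work on the whole: every member counts equally.
overall_pct = 100 * total_out / total_members

# Median of the branch percentages: every branch counts equally.
branch_pcts = [100 * out / n for _, n, out in branches]
median_pct = median(branch_pcts)

print(round(overall_pct, 1), median_pct)  # 14.1 12.5
```

The first figure is dominated by the largest district, while the second gives each branch the same weight; reporting both, as the table proposes, shows whether a single large branch is responsible for the general percentage.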
TABLE II. Current Wages.

Name of      [Average] in Branches.         Quartiles of        Measure of
Society.     Unweighted.    Weighted.       Branch Wages.       Dispersion
                                                                (v. p. 136).
             s.  d.         s.  d.          s. d.    s. d.

A.S.E.       30   0         29   7          28  0    32  0      [?]
O.S.B.
&c.













TABLE III.
A. Change of Wage between 1865 and 1885.

             Number of Branches showing            Median of     Percentages of Members in
                                                   Percentage    Branches showing
Name of      No       De-      No       In-        Increases.    No       De-      No       In-
Society.     Answer.  crease.  Change.  crease.                  Answer.  crease.  Change.  crease.

A.S.E.       4        1        5        13         10            11       4        6        79
O.S.B.
&c.


Verbal Summary. — In the great majority of cases a con- 
siderable increase of wage took place between 1865 and 1885, 
equivalent on the whole to a rise of about 10 per cent. The 
figures are not sufficiently definite to give an exact average. 

Table III. B. Change of Wage between 1865 and the
Maximum about 1873.

Table III. C. Change of Wage between Maximum about
1873 and 1885.

(Tabulation as in III. A.)

CHAPTER VII.

THE GRAPHIC METHOD. 

I. General Purpose. 

[Marginal note: Averages and diagrams.]

The two main methods of elementary statistics which ought to
be understood by all students or officials who handle figures,
which are easily within the grasp of all independently of mathematical
training, but are generally misunderstood or ignored by
the uninterested or the uninitiated, are the method of averages
and the method of diagrams or the graphic method. These two
are placed together because the uses of averages and diagrams
are nearly related. When we deal with large and complex
masses of figures we are unable to grasp them in
their entirety, however clearly they may be tabulated.
Any list of figures (the populations of different towns,
the death-rates at successive ages, the wages of many workpeople,
the imports for a series of years) becomes less comprehensible
as its length increases. A series of ten numbers can,
perhaps, be easily grasped, of twenty only with an effort; while
a printed list of figures for one hundred successive years leaves
hardly any impression on our mind at all; we cannot see the
wood for the trees. The test to which all questions as to the
use of averages should be referred is that the averages selected
should afford the best summary of the whole group in question
that the mind can grasp. When the meaning of the word
average was sufficiently extended, we found that we could select
three, four, or even ten suitable figures which adequately showed
the main features of any group. The main use of diagrams is
also to present large groups of figures so that they shall be
intelligible in their entirety, and the test for all diagrams is that
the diagram as drawn should afford the best view of the series
or group of figures that the eye can appreciate. Diagrams have
one use which averages have not, for it is only by a diagram that
a series of figures relating to successive years can be adequately
presented; but in reality they are less essential than averages, for
the latter often have an existence independently of the figures
from which they are derived, representing true types of the
quantities which are being measured; and by their use alone
are further comparisons of complex groups made possible: while
diagrams, on the other hand, might be dispensed with, being
auxiliary rather than essential, merely an aid to the eye and
a means of saving time.

[Marginal note: Graphic representation of averages.]

To connect this chapter more closely with the preceding, we
will show how the same group of figures, for
example the wages of a large group of workpeople,
can be represented by either method.
Consider the following data:



Numbers of workpeople earning:

Weekly wage.           Number.     Total of group.

From 15/ to 16/           200
From 16/ to 17/           400
From 17/ to 18/           100
From 18/ to 19/           100
From 19/ to 20/           200          1,000

From 20/ to 21/           200
From 21/ to 22/           300
From 22/ to 23/           300
From 23/ to 24/           500
From 24/ to 25/           900          2,200

From 25/ to 26/         1,200
From 26/ to 27/           800
From 27/ to 28/           700
From 28/ to 29/           500
From 29/ to 30/           300          3,500

From 30/ to 31/           300
From 31/ to 32/           400
From 32/ to 33/           400
From 33/ to 34/           500
From 34/ to 35/           500          2,100

From 35/ to 36/           600
From 36/ to 37/           400
From 37/ to 38/           100
From 38/ to 39/            80
From 39/ to 40/            20          1,200

Using the method of averages we should replace this group by
the following figures:

                              s.  d.
Average of all                27   6
Average of lowest 1,000       17   0
Average of highest 1,000      36   6
Average of middle 4,000       27   0

or

Median, 26/9; quartiles, 24/2, 32/.

Deciles, 20/, 23/6, 24/9, 25/8, 26/9, 28/2, 31/, 33/4, 35/4.

Mode, 25/3; secondary positions, 16/6, 36/.

or

Persons earning from    15/ to 20/   20/ to 25/   25/ to 30/   30/ to 35/   35/ to 40/
Percentages of all          10           22           35           21           12
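
These summary figures can be recomputed from the tabulated numbers by a short sketch. It rests on two assumptions that are not stated in the text: that every earner in a class is placed at the middle of the class when the average is taken, and that earners are spread evenly through the class when the median and deciles are interpolated. The function names are illustrative.

```python
# Numbers of workpeople in each one-shilling class, beginning with 15/ to 16/.
LOWER = 15
COUNTS = [200, 400, 100, 100, 200,
          200, 300, 300, 500, 900,
          1200, 800, 700, 500, 300,
          300, 400, 400, 500, 500,
          600, 400, 100, 80, 20]


def grouped_mean(lower, counts, width=1):
    """Arithmetic average, each class being taken at its middle point."""
    total = sum(counts)
    return sum((lower + (i + 0.5) * width) * n for i, n in enumerate(counts)) / total


def grouped_quantile(lower, counts, k, parts, width=1):
    """The k-th of `parts` equal divisions (median: k=1, parts=2),
    found by linear interpolation within the class where it falls."""
    target = sum(counts) * k / parts
    running = 0
    for i, n in enumerate(counts):
        if n > 0 and running + n >= target:
            return lower + (i + (target - running) / n) * width
        running += n
    return lower + len(counts) * width


def to_s_d(x):
    """Express decimal shillings as (shillings, pence to the nearest penny)."""
    return divmod(round(x * 12), 12)


mean = grouped_mean(LOWER, COUNTS)
median = grouped_quantile(LOWER, COUNTS, 1, 2)
deciles = [grouped_quantile(LOWER, COUNTS, k, 10) for k in range(1, 10)]
bands = [100 * sum(COUNTS[i:i + 5]) / sum(COUNTS) for i in range(0, 25, 5)]

print(to_s_d(mean), to_s_d(median))  # (27, 6) (26, 9)
print(bands)                          # [10.0, 22.0, 35.0, 21.0, 12.0]
```

Under these assumptions the average, the median and most of the deciles agree with the figures printed above; a few of the interpolated values differ by a penny or so, which suggests the printed figures were rounded or read from the diagram.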



[Diagram, to face p. 145: Representation of Wage Statistics. Abscissae: weekly wages in shillings, from 15 to 40; ordinates: numbers of persons. The positions of the deciles (D), quartiles (Q), median (M) and average (V) are marked along the base line.]

[Marginal note: Construction of simple diagrams.]

This group is represented on the annexed diagram, an
example of the graphic representation of the relation between
two variable quantities. A figure similar to this
may be used to show birth, marriage, or death
rates at different years, numbers of persons of
various statures, demand at different prices, or any such group
of homogeneous quantities. The same construction can be
used to show the changing values of any number in a series
of years. Draw a line parallel to the bottom of the page, and
mark equal intervals to represent a quantity which can have
many successive small increments, such as age, income, height,
price, time, and so on. This is called the axis of abscissae,
and the distance of a point measured from the zero position
along the line is called its abscissa. At right angles to this
line, parallel to the side of the paper, through the zero position
we draw another, called the axis of ordinates, and grade this
to correspond to the numbers possessing the qualities represented
by the abscissae; at each grade on the axis of abscissae,
draw lines at right angles to it, to represent on the chosen scale
the numbers at that grade; these lines are called the ordinates.
In the annexed diagram the abscissae represent the amounts
of wages, the ordinates the number of persons earning them.
Join the tops of the ordinates by straight lines and the diagram
is complete. In practice, when squared paper is used, without
drawing the ordinates their tops can be marked.
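
The construction can be sketched numerically: each class of the wage table gives one point, whose abscissa is the middle of the class and whose ordinate is the number in it, and the line of the diagram joins these points in order. The sketch below assumes the one-shilling classes of the wage table, and the helper names `polygon_points` and `peaks` are illustrative only. It also picks out the points standing higher than both neighbours, where the modes are to be looked for.

```python
# Numbers of workpeople in each one-shilling class, beginning with 15/ to 16/.
LOWER = 15
COUNTS = [200, 400, 100, 100, 200,
          200, 300, 300, 500, 900,
          1200, 800, 700, 500, 300,
          300, 400, 400, 500, 500,
          600, 400, 100, 80, 20]


def polygon_points(lower, counts, width=1):
    """(abscissa, ordinate) of the top of each ordinate, placed at mid-class."""
    return [(lower + (i + 0.5) * width, n) for i, n in enumerate(counts)]


def peaks(points):
    """Abscissae of interior points whose ordinate exceeds both neighbours."""
    return [points[i][0]
            for i in range(1, len(points) - 1)
            if points[i - 1][1] < points[i][1] > points[i + 1][1]]


points = polygon_points(LOWER, COUNTS)
print(points[0])      # (15.5, 200)
print(peaks(points))  # [16.5, 25.5, 35.5]
```

The list of points can be handed to any plotting tool, or marked by hand on squared paper as the text describes.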

[Marginal note: Description of the wage ...]

This diagram shows at one glance the distribution of the
wage-earners according to their wages. A small number earned
between 15s. and 16s., a slightly larger group
between 16s. and 17s., very few between 17s. and
19s. Above 19s. the number continually rises;
high numbers are found from 24s. to 27s., the highest between
25s. and 26s. The line falls to the 30s. group, but not so low
as between 17s. and 19s., then it rises regularly to 36s., and
falls rapidly to 39s. Here, then, we have the main group
congregated in the neighbourhood of 25s., a distinct but smaller
group at 36s., and a small and nearly isolated group at 16s.;
representing a considerable group of highly-skilled men between
30s. and 40s., the great mass with ordinary skill between 20s.
and 30s., and a small group of incompetents at 16s. These
features would not be so easily seen from the tabulated figures.

It is to be noticed that the number tabulated as between
15s. and 16s. is represented by the ordinate at 15s. 6d., the
middle of the interval; if the original figures on which the
table was based had been given to the nearest 1d., the ordinate
should be drawn at 15s. 5½d.* It is important that these middle
points should be accurately placed.

The use of the line joining the tops of the ordinates is two- 
fold. First, it enables the eye to judge relative heights more 
easily ; and secondly, it suggests the idea of con- 
tinuity, which can be better illustrated by the next 
diagram. In this the abscissas represent ages, the ordinates 
the estimated numbers of persons living at and above the ages 
at which they stand per million inhabitants of England and 
Wales at the middle of the year 1891. The ordinates were 
drawn at the points on the axis of abscissae representing the 
middle of each year of age ; but length of life cannot be 
expressed exactly in years, or even in months, days, or minutes. 
The intention of the diagram is to show the proportion living 
above each age, and for this purpose the joining line should 
have no breaks or sharp angles, but should suggest absolute 
continuity. 

In practice, it is useless to mark in the points for smaller 
intervals than a year, for the eye could not grasp the detail.
It is, however, implied that the line drawn has the same shape 
as that which would result if the number of persons was infinite 
and the subdivision by age infinitesimal. 

Estimated number per 1,000 of the population at and above:

Age.  Number.    Age.  Number.    Age.  Number.    Age.  Number.    Age.  Number.

  0    1,000      16     628       32     346       49     152       65      47
  1      973      17     607       33     332       50     143       66      43
  2      949      18     587       34     318       51     135       67      38
  3      925      19     567       35     305       52     127       68      34
  4      901      20     547       36     292       53     119       69      31
  5      877      21     528       37     280       54     112       70      27
  6      854      22     510       38     268       55     104       71      24
  7      830      23     491       39     256       56      98       72      21
  8      807      24     474       40     244       57      91       73      18
  9      783      25     456       41     233       58      85       74      15
 10      760      26     439       42     222       59      79       75      13
 11      738      27     423       43     211       60      73       76      11
 12      715      28     407       44     201       61      67       77       9
 13      693      29     391       45     191       62      62       78       8
 14      671      30     376       46     181       63      57       79       6
 15      649      31     361       47     171       64      52       80       5
                                   48     161

Calculated from the Census of 1891.

* See p. 88, supra.

[Diagram: Numbers Surviving at each Age in a Generation of 1,000. Vertical axis: numbers, from 0 to 1,000; horizontal axis: ages, up to 80.]

Apply these remarks to the diagram facing p. 145. Average 
earnings for a year will not be reckoned exactly by shillings 
or even pence ; if we had a sufficient number of instances we 
should get regular sequences of earners at successive farthings, 
and the line representing them would have no sharp angles, 
but be continually curved. The figure rightly gives the eye 
this impression of continuousness. Similarly in the diagram 
representing exports facing p. 151, the line correctly gives the 
impression that exports are continuous day by day. 

[Marginal note: Area.]

By an obvious step we may suppose that the unit of area, that
contained between vertical lines through two consecutive divisions
on the axis of abscissae, and horizontal lines through
two consecutive divisions on the axis of ordinates,
represents one wage-earner, and it is then easy to see that the
area contained between the base line, the curve, and two vertical
lines through the points marking any two amounts of wage represents
the total number earning rates between those amounts.

Hence the lines (see p. 145) through M, the position of
the median, Q1, Q3 those of the quartiles, D1, D2, D3, D4, D6, D7,
D8, D9 of the deciles divide the area ABm1m2m3CD into two,
four, and ten equal areas respectively. The centre of gravity
of this figure lies on the vertical line through V, the average
wage; and n1, n2, n3, the feet of the ordinates through the
highest points m1, m2, m3, are at the modes.
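
The statement that area represents numbers can be illustrated by a small sketch. For simplicity it assumes a stepped figure, in which the earners of each one-shilling class are spread evenly across the class, rather than the line of the diagram; the function name `number_between` is illustrative. On that assumption the area standing between any two wages is the number earning between them, and the vertical through the median cuts the whole area in half.

```python
# Numbers of workpeople in each one-shilling class, beginning with 15/ to 16/.
LOWER = 15
COUNTS = [200, 400, 100, 100, 200,
          200, 300, 300, 500, 900,
          1200, 800, 700, 500, 300,
          300, 400, 400, 500, 500,
          600, 400, 100, 80, 20]


def number_between(lower, counts, a, b, width=1):
    """Area of the stepped figure standing between abscissae a and b."""
    total = 0.0
    for i, n in enumerate(counts):
        left = lower + i * width
        right = left + width
        # Length of this class that lies inside the interval from a to b.
        overlap = max(0.0, min(b, right) - max(a, left))
        total += n * overlap / width
    return total


print(number_between(LOWER, COUNTS, 20, 30))     # 5700.0
print(number_between(LOWER, COUNTS, 15, 26.75))  # 5000.0, half of the 10,000
```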
[Marginal note: Requisite accuracy.]

The details of technique of diagram drawing, the position
of the scales, the devices for making the figure clear, and so
on, can be gathered from the various diagrams
given in this chapter. The degree of accuracy to
which the figures should be marked, whether correct to a
million, a thousand, or a unit, is determined simply by the
power of the eye to grasp detail; in most of those here given
it will be found that a displacement of one in a thousand is
perceptible, and this is the ordinary limit. More minute accuracy
is useless, for it is not the function of diagrams to dispense
with lists of numbers, but only to enable the eye to perceive
their significant features.

Before discussing the choice of scales on which the numbers
are to be represented, it is necessary to consider the ways in
which a diagram makes an impression on the eye.
The eye can judge (1) distances; (2) ratios; (3)
angles. The dotted lines in the diagram facing p. 151 will illustrate
these points. (1.) The eye is a fairly safe judge of distances;
there is very little doubt which of two points is the further
from the base line; when squared paper is used, a difference
of 1 in 1,000 is perceptible. The eye can also judge differences
quickly. In the figure the value of the exports in 1883 exceeded
that in 1885 by more than the value in 1890 exceeded that in
1883. (2.) It can be quickly seen that the value of exports
doubled between 1862 and 1889; or that the value in 1878 is three-quarters
of that in 1890. The accuracy with which the eye can
make such measurements is not great; it is not easy to detect
that the ratio of the values in 1873 and 1871 (1.095 : 1) is greater
than the ratio of the values in 1882 and 1880 (1.073 : 1); but the
general impression given by the diagram is partly made up by
unconscious calculations of this nature. To make these observations
accurately the method described on pp. 188-9 should be
used. Notice that for these observations the insertion of the
base line is necessary; and, because they are made unconsciously,
there are very few cases where a diagram without a base line
gives a correct impression. (3.) The question, Was the increment
greater in 1887-88 or in 1888-89? can be more quickly answered
by observing the angles than by noting the differences. The
line showing the latter change is steeper (makes a greater angle
with the horizontal) than the line showing the former. Hence
the latter increase is the greater; actually £14,400,000 against
£12,600,000. The most useful exercise of this power, however,
is to judge the dates at which the rate of increase changed; thus
the value of exports increased in 1862-63, increased at a slower
rate in 1863-64, and slower yet in 1864-65, more rapidly in
1865-66; a slow fall followed in 1866-67, then an increase began
which is continually accelerated to 1871, and so on. The line
from 1872-76 is concave to the base line, showing an accelerated
fall; the concavity from 1879 to 1882 corresponds to a retarded
rise; at 1888 convexity gives place to concavity, for at that date
the rate of increase began to diminish.
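
The three kinds of judgment can be set beside the figures themselves. The sketch below assumes the yearly values of exports (in millions of pounds) for the years named in the paragraph, as they stand in the table of exports in this chapter; the variable names are illustrative.

```python
# Yearly value of exports in millions of pounds, for the years referred to.
exports = {1862: 124.0, 1878: 192.8, 1883: 239.8, 1885: 213.1,
           1887: 221.9, 1888: 234.5, 1889: 248.9, 1890: 263.5}

# (1) Distances: two differences compared.
excess_1883_over_1885 = exports[1883] - exports[1885]
excess_1890_over_1883 = exports[1890] - exports[1883]

# (2) Ratios, which need the base line to be judged by eye.
ratio_1889_to_1862 = exports[1889] / exports[1862]
ratio_1878_to_1890 = exports[1878] / exports[1890]

# (3) Angles: with equal yearly intervals the steeper line is the larger rise.
rise_1887_88 = exports[1888] - exports[1887]
rise_1888_89 = exports[1889] - exports[1888]

print(round(excess_1883_over_1885, 1), round(excess_1890_over_1883, 1))  # 26.7 23.7
print(round(ratio_1889_to_1862, 2), round(ratio_1878_to_1890, 2))        # 2.01 0.73
print(round(rise_1887_88, 1), round(rise_1888_89, 1))                    # 12.6 14.4
```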

[Marginal note: Choice of scale.]

It is difficult to lay down rules for the proper choice of the
scales by which the figure should be plotted out. It is only the
ratio between the horizontal and vertical scales
that need be considered. The figure must be
sufficiently small for the whole of it to be visible at once; if the
figure is complicated, relating to a long series of years and varying
numbers, minute accuracy must be sacrificed to this consideration.
Supposing the horizontal scale decided, the vertical
scale must be chosen so that the part of the line which shows
the greatest rate of increase is well inclined to the vertical,
which can be managed by making the scale sufficiently small;
and, on the other hand, all important fluctuations must be
clearly visible, for which the scale may need to be increased.
Any scale which satisfies both these conditions will fulfil its
purpose.

[Marginal note: Necessity of correct base line.]

The annexed page shows the erroneous impressions
which can be given by a judicious manipulation of the scale
and by the omission of the base line. The diagrams, which
are drawn roughly, all represent the same estimates of wages in
England and in the United States of America for certain years
from 1860. Figure 1 sets the lines in proper relief. In figure 2,
the base line is not drawn in the zero position
for the English scale, and the American scale is
reduced; the consequence is that English wages
appear to have fluctuated widely, while American made steady
progress. In figures 3, 4, and 5 the scales are doctored and the
base line adjusted, so that in 3 American wages seem to have
caught up English, in 5 exactly the reverse is the case, while in
4 wages appear to have moved with equal rapidity in both
countries. An examination of these figures will show that the
eye cannot be trusted to supply the right base line, or to
estimate the importance of fluctuations without it; and, with
certain exceptions to be mentioned later,* it is well to distrust
all those numerous diagrams, where space has been economised
at the expense of the base line.
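
The effect of a displaced base line can be put in figures. In the sketch below the two wage values are hypothetical, chosen only to show the mechanism, and the function name `apparent_ratio` is an illustration: it gives the ratio of the two heights as they appear on the paper when the base line is drawn at some value other than zero.

```python
def apparent_ratio(earlier, later, base=0.0):
    """Ratio of the two heights as drawn above a base line placed at `base`."""
    return (later - base) / (earlier - base)


# Hypothetical weekly wages of 20s. and 25s. at two dates.
true_ratio = apparent_ratio(20, 25)            # base line at zero
cut_ratio = apparent_ratio(20, 25, base=15)    # base line raised to 15s.

print(true_ratio, cut_ratio)  # 1.25 2.0
```

A rise of one-quarter is thus made to look like a doubling, though every point is plotted correctly; only the true base line lets the eye form the ratio it forms unconsciously.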

Total Declared Real Value of British and Irish Produce
Exported from the United Kingdom. 1 = £1,000,000.

                             Averages.
Year.     Yearly      Three      Five       Ten
          Value.      Yearly.    Yearly.    Yearly.

1855       95.7        ...        ...        ...
1856      115.8        ...        ...        ...
1857      122.0       111.2       ...        ...
1858      116.6       118.1       ...        ...
1859      130.4       123.0      116.1       ...
1860      135.9       127.6      124.1       ...
1861      125.1       130.5      126.0       ...
1862      124.0       128.3      126.4       ...
1863      146.5       131.9      132.4       ...
1864      160.4       143.7      138.4      127.2
1865      165.[?]     157.6      144.4      134.3
1866      188.9       171.7      157.2      141.6
1867      181.0       178.6      168.7      147.5
1868      179.7        [?]       175.1      153.8
1869      190.0        [?]       181.0      159.8
1870      199.6       189.8      187.8      165.9
1871      223.1       204.2      194.6      175.7
1872      256.3       226.3      209.7      188.9
1873      255.2       244.9      224.8      200.0
1874      239.6       250.4      234.7      207.9
1875      223.5       239.4      239.6      213.7
1876      200.6       221.0      235.1      214.9
1877      198.9       207.7      223.7      216.7
1878      192.8       197.4      210.9      218.0
1879      191.5       194.4      201.4      218.1
1880      223.1       202.5      201.3      220.5
1881      234.0       216.2      208.2      221.6
1882      241.5       232.9      216.7      220.1
1883      239.8       238.4      226.0      218.6
1884      233.0       238.1      234.3      217.9
1885      213.1       228.6      232.3      216.9
1886      212.7       219.6      228.0      218.1
1887      221.9       215.6      224.1      220.4
1888      234.5       223.0      223.0      224.5
1889      248.9       235.1      226.2      230.2
1890      263.5       249.0      236.3      234.2
1891      247.2       253.2      243.2      235.5
1892      227.1       245.9      244.2      234.1
1893      218.1       230.8      240.9      231.9
1894      215.8       220.3      234.3      230.2
1895      225.9       219.9      226.8      231.4
1896      240.1       227.3       [?]       234.1
1897      234.3       233.4      226.8      235.4
1898      233.4*      235.9      229.8      235.3
1899      255.4*      241.0      237.8      236.1

* Not including the newly reckoned value of ships exported.

[Marginal note: Smoothing curves.]

We can now pass on to the consideration of the smoothing
of curves, for which purpose the question of the "alleged
stationariness of our exports," discussed by Sir R.
Giffen in his paper before the Royal Statistical
Society in 1899, affords an excellent illustration. The thin
dotted line on the diagram opposite shows the value of exports
year by year, and the first impression given by it is that exports
have not grown in value in recent years. Sir Robert Giffen
gave the following table:

Average Annual Value of Exports.

1855-57        £134,000,000
1865-67         228,000,000
1875-77         264,000,000
1885-87         274,000,000
1895-97         292,000,000

and from this he deduced "that all through there is an increase,
and that the only sign of stationariness is an increase at a less
rate in the last periods than in the earlier periods."

* See pp. 188-194, infra.

The Saturday Review * wrote " that such a conclusion is 
grossly misleading," for the figures are merely triennial averages 
of selected years showing a happy coincidence ; " why was not 
1898 included?" An inspection of the numbers does not show 
us the answer to this criticism, but on the diagram the whole 
circumstances are visible at a glance. Since 1865 three great 
waves have been completed. The maximum of 1872, due to the 
inflated prices of that year, is very high, but that of 1890 is 
greater than any previous figure, while the maximum in 1882 is 
comparatively low. The minima increase throughout; those 
of 1868, 1879, 1886 show a regular progression, which falls off
greatly in 1894. In 1894-96 it looked as if another decennial
cycle was in progress, but this has been checked. Since the 
discussion the returns for 1899 show an increase which brings the 
figure for that year very near the maximum of 1872. 

The Saturday Review went on to ask why Sir Robert Giffen
did not give "proper quinquennial averages," such as

Average Annual Value of Exports.

1870-74        £235,000,000
1880-84         234,000,000
1890-94         234,000,000
1898            233,000,000

and it must be granted that this gives an appearance diametrically
opposite to that of the previous table.

* Saturday Review, January 1899, pp. 66, 67.

It is clear that we need some general method of bringing
these figures into a form which shall be quite independent of the
choice of any special years. The diagram facing page 151 does
this. The thick continuous line, lying almost over the dotted line
of annual values, shows triennial averages taken yearly, that
is the average of each year with those before and after it; this
line smooths off the corners without affecting the general appearance.
The line of crosses shows quinquennial averages, each
year being averaged with the two previous and two subsequent
years. The line of circles shows decennial averages; each circle
is placed at the centre of the period whose average it represents;
thus the circle showing the average of the ten years 1875-84 is
placed vertically over the line separating the years 1879 and
1880.*
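
The averages taken yearly can be sketched as follows, each average being placed at the centre of the years it covers, as on the diagram. The yearly values are those of the export table for 1875 to 1884, in millions of pounds; the function name `moving_average` is illustrative.

```python
def moving_average(series, span):
    """Average of every run of `span` consecutive years, with the position
    of the centre of the run: a list of (centre, average) pairs."""
    years = sorted(series)
    result = []
    for i in range(len(years) - span + 1):
        window = years[i:i + span]
        centre = (window[0] + window[-1]) / 2
        result.append((centre, sum(series[y] for y in window) / span))
    return result


# Yearly value of exports, 1875 to 1884, in millions of pounds.
exports = {1875: 223.5, 1876: 200.6, 1877: 198.9, 1878: 192.8, 1879: 191.5,
           1880: 223.1, 1881: 234.0, 1882: 241.5, 1883: 239.8, 1884: 233.0}

triennial = moving_average(exports, 3)
quinquennial = moving_average(exports, 5)
decennial = moving_average(exports, 10)

centre, value = decennial[0]
print(centre, round(value, 1))  # 1879.5 217.9
```

The single decennial average falls at 1879.5, that is, over the line separating 1879 and 1880, and its value agrees with the ten-yearly figure entered against 1884 in the table, where each average stands opposite the last year of its period.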

On looking at the line of quinquennial averages it is clear 
that the Saturday Review did precisely what it accused Sir 
onoiMof Robert Giffen of doing, for years are taken which 
portodi. favour the argument. The quinquennial periods 
selected for comparison with 1898 are all on the upper parts 
of the waves, the marks showing these averages are very near 
the maxima of the quinquennial line, while the year 1898 does 
not appear to be a maximum. We might with just as much or 
as little accuracy give the following : — 

Quinquennial Averages of the Values of Exports. . 

1865-69 ;;^i 8 1,000,000 

1875-79 201,000,000 

1885-89 226,000,000 

1898 233,000,000 

and say that the value in 1898 was higher than any of the pre- 
vious selected averages. There is no need to use arbitrary dates 
to get at the facts. No argument can stand which does not take 
account of the cycle of trade, which is not eliminated till we 
take decennial averages. Special marks in the diagram show 
the averages for 1859-68, 1869-78, 1879-88, 1889-98, and indicate 
a rapid increase before 1870, and a steady slower progress since. 
The complete line gives just the same general appearance. If, 
finally, the figures were completely smoothed by a freehand line 
keeping as close to this as was possible, without making sudden 
changes of curvature, the same appearance would be given ; the 
thick line on the diagram is an attempt to do this. The smooth- 
ing is obtained by the assumption that the cycle of trade is ten 
years ; when two maxima fall within the same ten years the 
average of this period by our construction gives the appearance 
of a maximum (e.g., in 1887) at a date of a minimum. This 
would be avoided if we continually changed our period for 
averaging to accommodate the changing wave-length, a some- 
what arbitrary proceeding. The difficulty thus arising can be 
easily corrected by the eye, and the final smoothed line is 
intended to convey this corrected impression. 
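
The claim that the cycle is not eliminated until the averaging period equals its length can be checked on an artificial series. The sketch below assumes a hypothetical series made of a steady rise of 2 per year plus a regular ten-year wave; real trade figures have no such regular wave, which is the difficulty the text goes on to discuss.

```python
import math


def moving_average(values, window):
    """Plain moving average; one output per full window of terms."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical series: steady growth of 2 per year plus a ten-year wave.
series = [100 + 2 * t + 15 * math.sin(2 * math.pi * t / 10) for t in range(40)]

five = moving_average(series, 5)
ten = moving_average(series, 10)

# Year-to-year steps of each averaged line.
steps_five = [b - a for a, b in zip(five, five[1:])]
steps_ten = [b - a for a, b in zip(ten, ten[1:])]

# The ten-year line rises by 2 every year; the five-year line still waves.
spread_five = max(steps_five) - min(steps_five)
spread_ten = max(steps_ten) - min(steps_ten)
```

Every window of ten terms contains one complete wave, whose terms cancel, so only the steady rise remains; a window of five contains half a wave and the fluctuation survives.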

* In all the curves of averages the mark showing the average is placed at 
the centre of gravity of the marks showing the 3, 5, or 10 quantities averaged. 

It should be clear now that it was in 1899 five years too 
soon to pay attention to the particular figure for 1898; the 
figures for the next five years, necessary to determine the char- 
acter of the coming wave, could not be foretold. When the figure 
for 1899 (not represented on the diagram) is included, the new 
decennial average (1890-99) is the highest on record, while the 
actual value for the year 1899 has only been exceeded in 1890 
and 1873. It will be seen, moreover, that the sentence quoted 
from Sir Robert Giffen on p. 152 is fully justified. 

The smoothed line now constructed represents the general 
tendency of the value of exports, when accidental and temporary 
variations are removed. If it were possible to separate entirely 
variations of short period from secular changes, to separate the 
ebb and flow of the tide of commerce from the steady current of 
increasing trade, we may suppose that we should obtain a result 
represented by this line. 
In it there are no sudden changes even in rates of growth, while 
the addition and subtraction year by year of relatively small 
quantities would produce precisely that irregular fluctuating line 
from which the smooth line was obtained. 

The fuller discussion of "smoothing" series of figures belongs 
to the chapter on interpolation, but one other group may be 
considered, as showing the use of the graphic method for 
obtaining regularity out of irregular raw material. Referring 
back to the figures given on p. 120, the wages of 5,000 workers 
can be expressed anew by a diagram, in which the ordinates 
represent the numbers earning at or above a certain wage. The 
thin angular line on the adjacent page represents these numbers, 
entered for every 10-cent group. This plan is especially useful 
for irregular figures, like this wage-group, for the line must 
always tend upwards from the numbers earning the highest 
wage to the numbers earning at least the lowest. The diagram 
is also at once adaptable to the graphic method of finding the 
median described on p. 127. 

The irregularities shown by the thin line do not arise from 
any law of wage-grouping, but are due to the accidents of obser- 
vation ; if we regard these returns as samples out of a much 
larger unregistered group, we may suppose that a smoothed 
curve will indicate approximately the form which would be 
obtained, if our returns were complete. To smooth this figure, 

draw a freehand line passing as near the points as possible 
without abrupt changes of curvature, as in the annexed diagram. 
A new approximation may be made for the median, quartiles, 
&c., by drawing horizontal lines through the points on the 
vertical scale corresponding to half, one-quarter, 
three-quarters, &c., of the workers; from the points where these 
cross the smooth line, draw vertical lines to the scale of 
dollars; the points on the scale so obtained are the median 
(quartile, &c.) wage. 

The results obtained are: 

                                        Median.   Quartile.   Quartile. 
Given on p. 92                          $1.49     ...         ... 
By method of p. 128, used in annexed 
  diagram                               $1.49     $1.16       $2.12 
From smooth curve in annexed diagram    $1.51     $1.15       $2.13 
By method of interpolation, p. 253      $1.536    ...         ... 


This method is not, however, one of great precision ; a very slight 
change in the curvature of the smoothed line would make more 
difference than those shown between the second and third lines 
in the above table. 
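
The graphic construction can be imitated numerically. The following is a minimal sketch, assuming straight lines between the tabulated points in place of the freehand curve; the function name `wage_at_share` and the figures are hypothetical and are not the returns for the 5,000 workers.

```python
def wage_at_share(wages, at_or_above, share):
    """Wage at which `share` of all workers earn at least that wage.

    `wages` is ascending; `at_or_above[i]` is the number earning at or
    above wages[i], so the counts descend. Straight-line interpolation
    between adjacent points stands in for the freehand curve.
    """
    target = share * at_or_above[0]
    for i in range(len(wages) - 1):
        hi, lo = at_or_above[i], at_or_above[i + 1]
        if lo <= target <= hi:
            if hi == lo:
                return wages[i]
            frac = (hi - target) / (hi - lo)
            return wages[i] + frac * (wages[i + 1] - wages[i])
    raise ValueError("share outside the tabulated range")


# Hypothetical returns: dollars, and numbers earning at or above each wage.
wages = [1.00, 1.50, 2.00, 2.50]
at_or_above = [100, 60, 20, 0]

median = wage_at_share(wages, at_or_above, 0.50)           # half earn at least this
lower_quartile = wage_at_share(wages, at_or_above, 0.75)   # three-quarters do
upper_quartile = wage_at_share(wages, at_or_above, 0.25)   # one-quarter do
```

The horizontal line of the text corresponds to `target`, and the vertical line dropped to the scale of dollars corresponds to the returned wage. A different smooth curve through the same points would shift the answers slightly, which is the want of precision the text notes.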

This method is more useful for determining the mode. It 
will be remembered that the difficulties in doing this before 
arose from the uneven distribution on the two sides of the 
mode, and in the displacement of the mode by the adoption of a 
second system of tabulation. The first of these difficulties 
entirely disappears in the graphic method, while the second is 
diminished, for the displacement now only depends on the slight 
possible variations in the curvature of the smooth line. The 
mode is clearly the position where the greatest number is added, 
in the present method of representing the figures: that is, the 
mode is where the line, angular or smooth, is steepest. On the 
smooth curve the maximum steepness is where the tangent crosses 
the curve, in mathematical language, at a point of inflexion. 
This can be determined mechanically by placing a ruler to touch 
the curve, and turning it round the curve till it crosses it. On 
the annexed figure this occurs in the interval between $1.10 and 
$1.40. 
A more complex method of determining both mode and median 
is discussed in Chap. X. 
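
The rule that the mode lies where the line is steepest can be sketched for the angular line. The function name `steepest_interval` and the figures below are hypothetical; the intervals are deliberately unequal, so the fall is reckoned per unit of wage.

```python
def steepest_interval(wages, at_or_above):
    """Return the wage interval over which the cumulative line falls fastest.

    The drop in the 'at or above' count per unit of wage is the density of
    workers in that interval, so the steepest interval contains the mode.
    """
    best = None
    for i in range(len(wages) - 1):
        drop = at_or_above[i] - at_or_above[i + 1]
        slope = drop / (wages[i + 1] - wages[i])
        if best is None or slope > best[0]:
            best = (slope, wages[i], wages[i + 1])
    return best[1], best[2]


# Hypothetical figures with unequal intervals of graduation.
wages = [0.50, 1.00, 1.25, 1.50, 2.50]
at_or_above = [200, 170, 120, 80, 0]

modal_interval = steepest_interval(wages, at_or_above)
```

Here the widest interval holds the most workers (80), yet the line is steepest between $1.00 and $1.25, where 50 workers fall within a quarter of a dollar.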

This graphic way of finding these averages has two great 
advantages. It can be applied to numbers which are given 
at irregular intervals of graduation (e.g., 30 at 30s. 6d., 40 at 
30s. 8½d., 35 at 40s. 1d., &c.) as easily and by exactly the same 
construction as to more regular returns; and if the smooth curve 
is carefully drawn, the number of modes can be seen at a glance 
and the individual importance of each can be estimated. In 
the annexed diagram, the curve is concave to the base line from 
$.30 to about $1.20, convex from about $1.20 to $3.15, concave till 
$3.40, and then convex till the end. The points of inflexion or 
the modes are where concavity gives way to convexity. Hence 
there are two modes, of which that near $3.40 is of the less 
importance. The mathematical method of pp. 252-4 shows them 
to be at $1.10 and $3.20. 



A large class of diagrams may be passed by with a few 
words. Writers and lecturers frequently use points, lines, 
triangles, squares, circles, even pictures, of different sizes 
to assist the presentation of the relative magnitude of numbers. 
These have their use for popular lectures and hand-books, but do 
not add anything to the significance of the figures. Collections 
of these may be found in the second volume of Gabaglio's Teoria 
Generale della Statistica, and in M. Levasseur's La Statistique 
Graphique in the Jubilee Volume of the Royal Statistical Society. 

Of these one group may be signalled as of practical use. 
Rectangles may be used to express three quantities : one side 
to represent price ; the adjacent side, quantity ; and the area, 
value : or number of houses, average number of inmates and 
population ; or number of hours' work per week, average output 
or hourly wage, and total output or weekly wage. The figures 
on the annexed page show the limit to which this method can 
be usefully pushed. 
Representation of Three Facts by Rectangles. 

Imaginary budgets of an artisan and a labourer, showing amounts 
spent weekly on various commodities, and number of hours' work 
necessary for each amount. 

[Diagram. Labels on the figure: Per week, £1. 13s. 4d.; 4d. per hour.] 

The horizontal scale represents pence per hour. .125 inch = 1d. 

The vertical scale represents number of hours per week. 
.1 inch = 2 hours. 

The areas represent amounts spent, and the whole rectangles show 
the week's wages on the same scale. 1 sq. in. = 13s. 4d. 
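
The three scales quoted with the figure are consistent with one another, as a short check shows. This is a sketch only: the function name `rectangle` is hypothetical, and the 50-hour week in the last lines is an assumed example, not a figure from the diagram.

```python
PENCE_PER_SHILLING = 12


def rectangle(pence_per_hour, hours_per_week):
    """Sides in inches and area in square inches on the scales quoted."""
    width = pence_per_hour * 0.125       # .125 inch = 1d. per hour
    height = hours_per_week / 2 * 0.1    # .1 inch = 2 hours
    return width, height, width * height


# One inch across is 8d. per hour; one inch up is 20 hours.
pence_per_sq_inch = (1 / 0.125) * (1 / 0.1 * 2)
shillings, pence = divmod(round(pence_per_sq_inch), PENCE_PER_SHILLING)

# Assumed example: 4d. per hour for 50 hours, that is 200d. in the week.
w, h, area = rectangle(4, 50)
weekly_pence = area * pence_per_sq_inch
```

One square inch stands for 160d., which is 13s. 4d., agreeing with the scale of areas stated for the figure.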



The use of statistical maps needs only a brief notice. Any 
numerical quality of a population, its density, average income, 
average taxation, may be shown district by district by suitable 
markings, or colours. Of these the most useful method is to 
choose one colour, say blue, for excess above the average; 
another, say red, for defect. Divide the districts into nine 
groups, say more than 7 per cent., 5 to 7 per cent., 3 to 5 per 
cent., 1 to 3 per cent. above the average: these should be 
marked by four shades of blue, becoming lighter as the average 
is approached; within 1 per cent. of the average, above or 
below, should be white; and shades of red, gradually becoming 
darker, will show the remaining grades below the average. Care 
must be taken not to adopt too many grades. 
For examples of this method see Booth's Life and Labour of 
the People, maps; the Statistical Atlas of the XIth Census of the 
United States; the Statistical Atlas of India; and the maps in 
M. Levasseur's paper just mentioned. A cheap and very effective 
method, by which similar results are obtained in black and white 
only, may be seen on Plate P (misprinted 2) in that paper, and in 
the excellent chapter on Graphic Representation in Bertillon's 
Cours élémentaire de Statistique, p. ^33 seq. 
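
The nine-fold grouping suggested for statistical maps can be written as a small rule. This is a sketch under assumptions: the function name `grade`, the numbering of the grades from +4 to -4, and the assignment of a value lying exactly on a boundary to the lighter shade are all choices made here for illustration.

```python
def grade(value, average):
    """Return a grade from +4 (darkest blue) to -4 (darkest red); 0 is white."""
    pct = (value - average) / average * 100   # excess over the average, per cent
    size = abs(pct)
    if size <= 1:
        step = 0          # within 1 per cent. of the average: white
    elif size <= 3:
        step = 1          # 1 to 3 per cent.
    elif size <= 5:
        step = 2          # 3 to 5 per cent.
    elif size <= 7:
        step = 3          # 5 to 7 per cent.
    else:
        step = 4          # more than 7 per cent.
    return step if pct > 0 else -step


# Hypothetical districts, with an average of 100.
grades = [grade(v, 100) for v in (92, 96, 100, 102, 104, 110)]
```

Four grades above, four below, and the white grade make the nine groups of the text.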
2. Historical Diagrams. 

Perhaps the chief use of diagrams is to afford a rapid view 
of the relations between two series of events. 

The different cases that occur are best illustrated by examples. 
The simplest is when we wish to compare two sets of figures 
expressed in the same unit, say £ sterling; and the simplest of 
these when we wish simply to compare a whole and its parts. 

On the adjacent diagram the upper line shows the annual 
total gross revenue (Statistical Abstract, p. 9); the next line, that 
part which comes from inland revenue and customs, the difference 
being mainly composed of post office and telegraph receipts. The 
principal heads of revenue are customs, excise, income tax, and 
post office. These are shown by suitable lines for each year, 
each line being independent of the other, and all having the same 
base line and being on the same scale. This method is greatly 
preferable to the alternative one of drawing a second line 
representing the total less customs, a third the total less 
customs and excise, and so on, because the eye is then quite 
incapable of judging the relative movements of the separate 
items. The figure shows at once the main features of the course 
of revenue. The increase has been rapid but irregular. The growth 
in the Crimean War was too rapid to be at once maintained, but 
the figures for the 60's are at a far higher level than those for 
the 50's. A rapid fluctuation in 1870 is followed by a more 
regular growth almost unchecked till 1887; and then, after a 
short stationary period, there is a great increase in 1895. These 
remarks apply almost without alteration to the line showing 
inland revenue and customs. If we look for the parts of the 
revenue that have borne the increase and change, we see that in 
the whole period receipts from excise have increased most, next 
those from the income tax, and next those from the post office, 
while the customs have diminished. Each line has its distinctive 
features. The post office payments show an almost regular 
growth. The income tax fluctuates violently, bearing the brunt 
of nearly all the rapid changes in the total, especially in 1856 
Revenue of the United Kingdom. 

Unit, in all columns, £10,000. 

Year.  Total     Inland Revenue  Customs.  Excise.   Property and  Post and 
       Revenue.  and Customs.                        Income Tax.   Telegraph. 

1850   5,739     5,431           2,226     1,497     560*          216 
1851   5,732     5,412           2,204     1,528     560*          228 
1852   5,658     5,335           2,222     1,538     550*          237 
1853   5,753     5,401           2,214     1,575     570*          237 
1854   5,890     5,502           2,251     ?         580*          252 
1855   6,282     5,944           2,163     ?         1,070*        ? 
1856   7,026     6,601           2,324     1,730*    1,520*        ? 
1857   7,279     6,848           2,353     1,840*    1,620*        292 
1858   6,788     6,309           2,3??     1,782     ?             292 
1859   6,548     5,987           2,412     1,790     668           320 
1860   7,109     6,570           2,446     2,036     960           331 
1861   7,028     6,514           2,331     1,943     1,092         340 
1862   6,986     6,412           2,367     1,833     1,036         351 
1863   7,060     6,390           2,403     1,715     1,057         3?5 
1864   7,021     6,306           2,323     1,821     908           381 
1865   7,031     6,291           2,257     1,956     796           410 
1866   6,781     6,036           2,128     1,979     639           425 
1867   6,943     6,156           2,230     2,067     570           447 
1868   6,960     6,204           2,265     2,016     618           463 
1869   7,259     6,422           2,242     2,046     862           466 
1870   7,543     6,708           2,153     2,176     1,004         477 
1871   6,994     6,106           2,019     2,279     ?             ? 
1872   7,471     6,484           2,033     2,333     ?             527 
1873   7,661     6,660           2,103     2,578     750           583 
1874   7,734     6,608           2,034     2,717     569           700 
1875   7,492     6,397           1,929     2,739     431           679 
1876   7,713     6,525           2,002     2,763     411           719 
1877   7,857     6,636           1,992     2,774     ?             730 
1878   7,774     6,610           1,997     2,746     ?             746 
1879   8,115     6,899           2,032     2,740     ?             757 
1880   7,934     6,695           1,933     2,530     923           777 
1881   8,187     6,895           1,918     2,530     1,065         ? 
1882   8,396     7,058           1,929     2,724     994           863 
1883   8,739     7,313           1,966     2,693     1,190         901 
1884   8,616     7,187           1,970     2,695     1,072         ? 
1885   8,799     7,380           2,032     2,660     1,200         ? 
1886   8,958     7,493           1,983     2,546     1,516         989 
1887   9,077     7,611           2,015     2,525     1,590         1,028 
1888   ?         7,566           1,963     2,562     1,444         1,0?? 
1889   8,847     7,360           2,007     2,560     1,270         1,118 
1890   8,930     7,341           2,042     2,416     1,277         1,177 
1891   8,949     7,358           1,948     2,479     1,325         1,226 
1892   9,099     7,534           1,974     2,561     1,381         1,263 
1893   9,040     7,480           1,971     2,536     1,347         1,288 
1894   9,113     ?               1,971     2,520     1,520         1,301 
1895   9,468     ?               2,011     2,605     1,560         1,334 
1896   10,197    8,512           2,076     2,680     1,610         1,422 
1897   10,395    8,597           2,125     2,746     1,665         1,477 
1898   10,661    8,855           2,180     2,830     1,725         1,5?? 
1899   10,834    8,945           2,085     2,920     1,800         1,586 

* These figures cannot be given accurately within £100,000. 
and 1870. The fall in this line in 1872-76 is counterbalanced by 
the rise in excise; while the excise line shows stationariness till 
1870, a sudden jump to 1874, and a very slow decline since that 
date. Customs, on the other hand, have to some extent taken an 
opposite course to that of excise, so that the total from the two 
has not changed very rapidly. There is a very marked 
stationariness since 1871. At the top of the page a new base line 
is taken, and the number of pounds per head of the population is 
shown year by year; it will be seen that the only important 
increase was between 1851 and 1857, and that since 1860 the 
fluctuations have been slight. 

So far we have found no more difficulty in the choice of 
scales than previously when dealing with only one line, for all 
the lines on the larger diagram indicate millions of pounds, and 
when the unit is £1, a new base line has been adopted. But we may 
need to show the change of population on the larger diagram. It 
is necessary, as we have already seen, to use the same base line 
for the two quantities to be compared; but we may choose any 
point for the beginning of the new line, adapting our vertical 
scale, for the eye can judge the proportionate changes wherever 
the line is placed. It is best to decide this point by defining 
the problem on which the comparison should throw light. If it is 
required to compare the growth of revenue with the growth of 
population since, say, 1850, we should start the new line at the 
point on the 1850 line where the revenue curve begins, and we can 
then see how the lines intersect one another again and again. 
Since 1850, however, is an arbitrary date, this plan lacks 
definition, and it is more logical to make the lines coincide at 
the most recent date given, with which any previous date can then 
be compared. The plan adopted on the diagram given is another 
alternative; the line is drawn on such a scale that it lies fairly 
close to that for inland revenue throughout the greater part of 
its course. 
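
The choice of a second scale amounts to multiplying the second series by a constant, the base line being kept. The sketch below shows the two plans discussed, coincidence at the first date and coincidence at the latest; the function name `rescale_to_coincide` and all the figures are hypothetical.

```python
def rescale_to_coincide(second, first, index):
    """Scale `second` so that it equals `first` at position `index`.

    Both lines keep the same base line (zero), so the proportionate
    changes in the second series are preserved.
    """
    factor = first[index] / second[index]
    return [factor * x for x in second]


# Hypothetical figures: revenue in £ million, population in millions.
revenue = [57.0, 70.0, 75.0, 108.0]
population = [27.0, 29.0, 31.0, 40.0]

from_first_year = rescale_to_coincide(population, revenue, 0)
from_latest_year = rescale_to_coincide(population, revenue, -1)

# Ratio of first to last is the same on either scale.
ratio_a = from_first_year[0] / from_first_year[-1]
ratio_b = from_latest_year[0] / from_latest_year[-1]
```

Whichever date is chosen, the ratio of any two heights on the new line is unchanged, which is why the eye can still judge proportionate changes.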

The next diagram, facing p. 164, introduces further 
difficulties as to the choice of scales. The object of the figure 
is to show the relations between quantity, value, and price of 
imported wheat, and population. The line A is first drawn on a 
scale chosen so as to throw its fluctuations into relief. 
Population is at once brought into relation with this by 
calculating the amount per head year by year. The line C to 
represent these figures is drawn on a different scale, chosen so 
that the line shall not cause confusion by continually crossing 
any of the others on the figure. If the figure was too full this 
could be treated as on p. 161, the revenue per head. The same 
scale of years must be used, and for simplicity of calculation 
and appearance, 100 lbs. consumed per head is measured by the 
same vertical distance as 10,000,000 cwt. imported. A and C 
refer to the same quantities, and therefore similar lines are 
used in both cases. The line B represents value and is shown by a 
broken line. For this line the choice of scale is more 
difficult. In the diagrams which follow, instances will be shown 
where special methods are used to bring out specific 
comparisons. Here this is not necessary, and a scale is adopted 
which brings the lines A and B into near relation, and shows the 
fluctuations of B, while the figure is made simple and 
intelligible by the representation of £20 by the same vertical 
distance as 20 cwt. 

The line D shows the changing price of wheat. The scale is 
chosen so that it boldly crosses the lines A and B; thus its 
fluctuations are clearly shown, and the numbers are easily seen, 
for 2s. per cwt. is represented by the same vertical line as 
10,000,000 cwt. If the figure was accurately drawn, lines A and 
D would lie one over the other in 1876-77; they are therefore 
shifted very slightly horizontally, and clearness is preserved 
without the general impression being vitiated. 

This line B shows some very interesting facts. Its chief 
characteristic is excessive fluctuation; while a smoothed line 
would show an upward tendency till 1878 and a fall since that 
date. The fluctuations are the result of a great number of 
causes: an increasing population, the fact that wheat imported 
is only complementary to the home product, which is dominated by 
the English weather, the variation of harvests all over the 
world, political events, the fall in the value of silver, the 
development of means of communication and transport, and all the 
other causes which determine price. Notice how all these are 
indicated by this single line. The upward tendency till 1875 
shows an increasing population; a deficient home harvest is shown 
by the rise in 1871-73, a world-wide deficiency by the fall in 
1880, a good home product by the fall in 1875. The American Civil 
War is marked in 1865, the general improvement in transport by 
the rise before 1875, the fall of prices by the fall since 1878. 
These various causes, however, often tend to neutralise one 
another. 

Importations of Wheat and Wheat Flour, 1862 to 1898. 

         A.                  B.               C.                   D. 
Year.    Total Quantities    Total Value      Quantity retained    Average Price of 
         Imported.           Imported.        per Head of the      Wheat and Wheat Flour 
         Unit, 100,000 cwt.  Unit, £100,000.  Population. lbs.     in Shillings per cwt. 

1862     500                 286              185                  11.44 
1863     32?                 155              112                  10.03 
1864     288                 135              104                  9.37 
1865     258                 ?                93                   9.61 
1866     294                 168              104                  11.43 
1867     391                 285              140                  14.58 
1868     365                 249              130                  13.64 
1869     444                 233              156                  10.50 
1870     369                 196              123                  10.62 
1871     444                 268              151                  12.07 
1872     476                 303              163                  12.73 
1873     516                 344              171                  13.33 
1874     493                 309              162                  12.53 
1875     595                 324              197                  10.89 
1876     519                 279              168                  10.75 
1877     635                 407              202                  12.82 
1878     597                 342              ?                    11.46 
1879     ?                   400              228                  10.95 
1880     ?                   393              209                  11.47 
1881     ?                   407              217                  11.42 
1882     ?                   449              242                  11.11 
1883     851                 438              252                  10.30 
1884     669                 301              192                  9.00 
1885     823                 337              ?                    8.19 
1886     670                 261              ?                    7.79 
1887     802                 314              224                  7.82 
1888     804                 315              223                  ? 
1889     789                 311              219                  7.88 
1890     824                 327              226                  7.94 
1891     895                 396              244                  8.85 
1892     956                 371              ?                    7.76 
1893     938                 30?              ?                    6.57 
1894     967                 268              ?                    5.54 
1895     1,073               302              ?                    5.63 
1896     ?                   309              257                  6.21 
1897     887                 330              228                  7.44 
1898     944                 377              238                  7.99 


As regards the choice of markings for different lines, the 
chief rule is that lines which cross one another, unless very 
acutely, must be marked differently. The second rule is to mark 
similar quantities in similar ways. Thus in the next diagram the 
lines representing quantities have a resemblance to one another, 
as have also those showing values; while the two lines relating 
to imports are distinguished 
The Cotton Trade, 1854-98. 

         Piece Goods Exported.               Raw Cotton Imported. 
Year.    Quantity.        Value.             Quantity.        Value.           Price per 
         000,000's        000's omitted.     000's omitted.   000's omitted.   Cwt. 
         omitted. Yards.  £                  cwts.            £                £ 

1854     1,693            25,055             7,923            20,175           2.55 
1855     1,938            27,579             7,962            20,849           2.62 
1856     2,035            30,204             9,142            26,448           2.89 
1857     1,979            30,323             8,655            29,289           3.38 
1858     2,324            33,422             9,235            30,107           3.26 
1859     2,563            38,744             10,946           34,560           3.16 
1860     2,776            42,142             12,419           ?                2.88 
1861     2,563            37,580             11,223           38,653           3.44 
1862     1,681            28,562             4,678            31,093           6.65 
1863     1,711            37,634             5,983            56,282           9.41 
1864     1,752            43,917             7,983            78,219           9.90 
1865     2,014            44,876             8,737            66,041           7.56 
1866     2,576            57,903             12,299           77,530           6.30 
1867     2,832            53,128             11,276           52,003           4.61 
1868     2,977            50,265             11,864           55,194           4.65 
1869     2,869            49,922             11,907           56,847           4.77 
1870     3,267            53,348             11,95?           53,478           4.47 
1871     3,417            53,643             15,876           55,907           3.52 
1872     3,538            58,931             12,579           53,381           4.24 
1873     3,484            56,493             13,639           54,705           4.01 
1874     3,607            55,023             13,990           50,696           3.62 
1875     ?                53,627             13,325           46,260           3.46 
1876     3,669            50,378             13,284           40,181           3.03 
1877     3,838            52,442             12,101           35,421           2.93 
1878     3,619            48,104             11,968           33,520           2.80 
1879     3,725            46,875             13,119           36,181           2.76 
1880     4,496            57,678             14,542           42,772           2.94 
1881     4,777            59,104             14,992           43,835           2.92 
1882     4,349            55,443             15,930           46,655           2.93 
1883     4,539            55,534             15,485           45,042           2.91 
1884     4,417            51,666             15,618           44,486           2.85 
1885     4,375            48,277             12,731           ?                2.86 
1886     4,850            50,172             15,313           38,128           2.49 
1887     4,904            51,742             15,995           40,156           2.51 
1888     5,038            52,582             15,462           40,009           2.59 
1889     5,001            51,388             17,299           45,642           2.64 
1890     5,125            54,160             16,013           42,757           2.67 
1891     4,912            52,432             17,811           46,081           2.59 
1892     4,873            48,766             15,850           37,888           2.39 
1893     4,652            47,282             12,650           30,685           2.43 
1894     5,312            50,219             15,965           32,944           2.06 
1895     5,033            46,759             15,688           30,429           1.94 
1896     5,218            51,196             15,669           36,272           2.31 
1897     4,792            45,808             15,394           32,195           2.09 
1898     5,216            47,910             19,005           34,126           1.80 
from those relating to exports. If it is possible to use more 
than one colour this principle can be easily carried out.* 

This diagram is intended to show the relations between the 
quantities and values of cotton imported and exported during 
forty years. The vertical scale for values is chosen so as to 
bring the whole figure to a convenient size and to mark the 
fluctuations. The value of the raw cotton imported is increased, 
perhaps trebled, by manufacture, and of the finished product a 
large part is used at home, the rest exported. The excess of the 
value of exports over imports therefore represents the increment 
of value due to manufacture (that is the total earnings or the 
wages and profits of the cotton industry), less the total value 
of all cotton goods sold at home. When the exports are less in 
value than the imports, the earnings of manufacture are less than 
the home consumption: when equal, equal. 
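
The reasoning of this paragraph may be set out in symbols. The letters are introduced here for illustration only and do not occur in the original: let $I$ be the value of raw cotton imported, $M$ the increment of value due to manufacture, $H$ the value of cotton goods sold at home, and $E$ the value exported. On the assumption that all the cotton imported is worked up and either sold at home or exported, the value of the finished product may be reckoned in two ways:

```latex
\begin{aligned}
I + M &= H + E, \\
E - I &= M - H .
\end{aligned}
```

Hence $E < I$ when $M < H$, and $E = I$ when $M = H$, which are the two cases stated in the text.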

Looking at the diagram, it will be seen that value of exports 
of piece goods exceeded that of general imports from 1854 to 
1860, though often by only a small margin; that the reverse 
was the case during the cotton famine in 1861-66, when extrava- 
gant prices were paid for raw cotton to partially supply the 
home market at a high price, while the export fell off. Equality 
was again attained in 1867, while since 1871 exports have 
greatly exceeded imports in value, the difference being perma- 
nently established since 1879. It would appear that the home 
market is saturated, while the foreign market has extended. 

The line representing the value exported may be described 
in a few words : a general and rapid increase took place from 
1850 to 1866, interrupted only by the Civil War ; since 1866 the 
fluctuations have been violent, but the general average stationary. 
The effect of the Civil War is well emphasised by all the lines 
here, and is clear also in the diagram facing p. 164. With a 
little experience in the use of diagrams these lines may be 
smoothed by the eye alone. 

The unit of the quantity of imports is 1,000,000 cwt. of raw 
cotton; one-tenth of this can be distinguished in the figure. In 
1854, 7,923,000 cwt. were imported, and their value was 
£20,175,000. If we represent 2 cwt. by the same vertical length 
as £5, as done in the figure, the lines begin at practically the 
same point. Adopting this scale, we are able to see at once the 
divergence of quantity from value during the period. 

* See Wages in the Nineteenth Century, by the present author, diagram 
p. 90. 

In the year 1891, 17,811,000 cwt., valued at £46,081,000, were 
imported. The sum would have bought 18,096,000 cwt. at the 
price of 1854, a difference of only 1½ per cent.; so that it 
happens in this case that the value and quantity lines are nearly 
together again in 1891. The actual course of prices is shown 
by the lowest line on the diagram. In 1862 quantity falls more 
quickly than value as price rises, and as the supply recovered 
in 1866 value went up before and more violently than quantity, 
owing to the high price. In 1869 quantity rose while value fell, 
but otherwise the lines fluctuate together and continually tend 
towards each other. 
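
The arithmetic of the comparison between 1854 and 1891 can be retraced as follows. The quantities and values are those entered for the two years in the table of the cotton trade (thousands of cwt. and thousands of pounds); the variable names are introduced here for illustration.

```python
# Raw cotton imported, from the table: 000's of cwt. and 000's of pounds.
quantity_1854, value_1854 = 7_923, 20_175
quantity_1891, value_1891 = 17_811, 46_081

# Average price in 1854, in pounds per cwt.
price_1854 = value_1854 / quantity_1854

# Quantity the 1891 outlay would have bought at the price of 1854.
quantity_at_1854_price = value_1891 / price_1854

# Excess of that quantity over the quantity actually imported, per cent.
excess_pct = (quantity_at_1854_price / quantity_1891 - 1) * 100
```

The result is a little over 18,096 thousand cwt., exceeding the actual import by about one and a half per cent., so that the two lines nearly meet again in 1891.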

The study of quantity and value in exports is more 
interesting. It is not obvious what commodity is the best 
representative of cotton exports. In 1895, 5,000,000,000 yards 
of piece goods valued at £46,700,000, 812,000 pairs of stockings 
valued at £220,000, 23,800,000 lbs. of sewing thread valued at 
£3,000,000, and 250,000,000 lbs. of cotton yarn valued at 
£9,200,000, were exported. A good plan, perhaps, 
would be to take so many yards of piece goods as equivalent to 
so many pounds of yarn, the relative prices being the criterion, 
and to add these together to determine the quantity; in the 
figure, however, piece goods only are taken. 

In this case there is no simple relation between quantity and 
value at the first date, and there is no simple method of making 
the two scales correspond. Having marked the value line on the 
squared paper in use, it was necessary to draw out a new system 
for the quantities. In 1854, 1,693,000,000 yards were valued at 
£25,055,000. Then 1,693/25 yards corresponded to £1, i.e., 67.7 
yards to a unit; and each number of yards had to be reduced to 
this scale. This is done in practice quite easily by a mechanical 
scale, by which numbers can be automatically reduced in any 
required ratio. The scale is then entered to the right-hand 
side of the figure. It is of course not easy to read the exact 
numbers off the figure, but it can be done with the help of 
a ruler. To avoid this difficulty, the actual amounts can be 
entered on the diagram at critical places. But after all it is 
not the object of the diagram to make it possible to read the 
numbers; the object is to show the relative rises and falls, and 
the steepness, and to allow comparisons of the lines. The figures 
should be taken from a table. No scale is in reality necessary, 
except for the process of drawing the lines. 
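
The reduction performed by the mechanical scale is a single division. A minimal sketch, in which the names `YARDS_PER_UNIT` and `quantity_height` are hypothetical and the quantities for 1854 and 1898 are those of the table of piece goods exported:

```python
# Million yards answering to each £1 million on the value scale (1854).
YARDS_PER_UNIT = 1_693 / 25


def quantity_height(million_yards):
    """Height of a quantity, measured in units of the value scale."""
    return million_yards / YARDS_PER_UNIT


height_1854 = quantity_height(1_693)   # level with the value line in 1854
height_1898 = quantity_height(5_216)   # far above the value line in 1898
```

On this scale the quantity of 1898 stands at about 77 units while the value of that year is about 48, which is the divergence the diagram is drawn to exhibit.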

The history of quantity of exports of cotton (and of other 
textiles) is quite different from that of values. Value fluctuates 
and shows very little rise since 1866. Quantity fluctuates, but 
not greatly, except at the Civil War, and except in the '60's, 
and in '80-6 shows a general rise. The smoothed curve would rise 
throughout. 

One important cause of this difference is, that, as Sir R. Giffen 
has pointed out, a large sum should be in reality deducted from ex- 
port values to allow for the import value of the raw cotton, before 
any conclusions are drawn as to the progress of British manu- 
facture. Now, as we have seen, the price of raw cotton has 
fallen very fast during precisely the period (since 1865) that 
export value has not grown. A greater corresponding deduction 
should be made in the earlier years than in the later, which would 
result in a definite, if fluctuating, rise in the period. This would 
not make values increase so fast as quantities; the difference is 
due to the general causes of the fall of price of manufactured goods. 
By looking carefully at the diagram it will be seen that the quantity 
line approached that of value, when the price was falling in 1866-68, 
and fell away again with the higher price of 1872 ; after 1872 
the quantity line gets nearer again, and crosses the value line in 
1875, when the price was the same as in 1850; since 1875, as 
prices fell, the divergence has steadily increased. 

It must be admitted that a study of the diagram representing
these figures leads much more rapidly and safely to
many interesting conclusions than the table on p. 164 of the 
figures themselves. 

3. Comparisons of Series of Figures.

A. Before proceeding to the study of the next diagram, it will 
be well to define more exactly what is our object in comparative 
studies of figures, and to consider the means at our disposal. 

When dealing with two series of similar quantities such as
the course of trade or population in two countries, we wish to
see the general rate of progress (to be done by
smoothing the curve), the years of special increase,
the dates of maximum and minimum, in fact to compare the three 
things that the eye can see — the increase, the rate of increase, 
and the dates of change of rate of increase. The most obvious 
way to do this is, to take the same scale and base line for both 
countries and the same unit of measurement ; but this method 
does not take us all the way. We can judge differences, it is 
true, and the additions in all the years in both countries, and we 
can see the highest and lowest points and dates of change of 
rate of increase ; but we cannot compare rates of increase. 
It is not easy to judge ratio, though a rough guess at it is 
possible. Thus if the trade is very different in magnitude in 
the two countries, equal absolute increments will mean very 
different relative increments, and it is difficult to be always on 
one's guard. 

[Sidenote: Percentage scales.]
The remedy for this is to alter the arrangement of scales.
Make a second figure, in which the unit shall be not a sum of
money, but a percentage : let 1 per cent. of England's
trade, say in 1850, be the unit for the
English line ; and 1 per cent. of the trade of Germany, at the same
date, for the German line. In other words, express the trade of
both countries as percentages of their value in a given year,
and draw lines to represent these percentages. Alongside the
diagram two or more scales can be placed showing the absolute
amounts of the trade of each country. Then the rates of
increase will be comparable, equal increments representing equal 
percentages of the trade of each country ; and, in addition, the 
dates at which either country gained ground relatively to the 
other can be easily picked out.

[Sidenote: Absolute or relative progress.]
The question as to whether
absolute rates or relative rates should be studied is a very common
one in statistics. Sometimes the absolute magnitude
should be known, as for instance when we want to
estimate the effect of measures which will affect
the well-being of special classes, or the trade of
special countries ; sometimes the relative rate, as when we want
to watch the progressive increase of different industries, or to be
on our guard as to future competitors. The two studies generally
require two different diagrams though they may represent
the same numbers.

It will be seen that the chief difficulty lies in the choice of
the year in which the quantities are to be equated ; this must 
be decided by the nature of the argument which the diagram 
is to illustrate. 



We may compare the following figures:

Year    1880    1890    1900
A        220     440     330
B        160     240     400

in three ways, thus:

1. Expressed as percentages of values in 1880.

   Scales.
   %       A.      B.
   200     440     320
   150     330     240
   100     220     160
    50     110      80

   [Figure 1 : lines A and B plotted for 1880, 1890 and 1900.]

2. Expressed as percentages of values in 1890.

   Scales.
   %       A.      B.
   150     660     360
   100     440     240
    50     220     120

   [Figure 2 : lines A and B plotted for 1880, 1890 and 1900.]

3. Expressed as percentages of values in 1900.

   [Figure 3 : lines A and B plotted for 1880, 1890 and 1900, with scales in %, A and B.]

In figure 3 the fluctuations are seen as percentages of the
values at the last date, and are thrown into better proportion
than in figure 1. It is frequently the case that the equating of
quantities at the most recent date throws what are often small
beginnings into their right proportion when viewed from the
modern standpoint. The statement that the values in 1880
were 40 and 67 per cent. respectively of the corresponding
present values, is in better perspective than the statement that
the values in 1900 were 250 per cent. and 150 per cent. of the
corresponding values in 1880 ; but circumstances must decide
in each case which method is to be adopted.
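
The three ways of equating the series A and B can be worked out mechanically. The following Python sketch is an illustration only; the function name `as_percentages` is an assumption of the sketch, and the figures are those of the small table above.

```python
# The two series compared in the text.
SERIES = {
    "A": {1880: 220, 1890: 440, 1900: 330},
    "B": {1880: 160, 1890: 240, 1900: 400},
}


def as_percentages(values: dict, base_year: int) -> dict:
    """Express every value as a percentage of the value in base_year."""
    base = values[base_year]
    return {year: 100 * value / base for year, value in values.items()}


# Print the three rebased versions, one line per series and base year.
for base_year in (1880, 1890, 1900):
    for name, values in SERIES.items():
        rebased = as_percentages(values, base_year)
        row = ", ".join(f"{y}: {p:.0f}" for y, p in sorted(rebased.items()))
        print(f"base {base_year}  {name}  {row}")
```

Changing the base year alters every percentage but never the order or the dates of the rises and falls, which is why the choice rests on the argument to be illustrated.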

[Sidenote: Illustration from trade with Germany.]
These points are fully illustrated by the annexed diagrams,
the object of which is to analyse the progress of our trade with
our colonies and with foreign countries, especially
Germany. The first figure shows the total imports
and exports, and the parts of each which
are colonial and foreign, the scale in millions of pounds being
the same for all the lines. A line is also given for imports from
Germany, Holland, and Belgium ; these are grouped together,
because it is not possible to distinguish in the returns from the
two latter home manufactures from German goods in transit.
It is not clear from this diagram which part of our imports has
increased most rapidly. The three lines are, therefore, redrawn
in the second diagram, on a percentage scale, all the values
being expressed as percentages of the corresponding values in
1898. It is now seen that imports from foreign countries
and from our colonial possessions and India have marched
together except during the period of the cotton famine, but the
trade from Germany has increased more rapidly than either. If
we had equated the quantities in 1862, the German line would
have far outpassed the others by 1898 ; but the impression given
would be erroneous as regards absolute quantities, for the
increase was only £50,600,000 for the one, while it was
£210,500,000 for the other. The remaining diagram shows the
relative rates of increase for Germany, Holland and Belgium,
and the British possessions respectively, since 1870.

Imports and Exports, 1862-1898.
Unit in all columns, £100,000.

Year    Total         Total         Exports to    Exports to    Imports from  Imports from  Imports from
        Imports.      Exports,      British       Foreign       British       Foreign       Germany,
                      including     Possessions.  Countries.    Possessions.  Countries.    Holland and
                      Re-exports.                                                           Belgium.

1862    2,257         1,662         454           1,207         653           1,604         279
1863    2,489         1,969         550           1,419         847           1,642         283
1864    2,749         2,126         557           1,569         937           1,812         332
1865    2,711         2,188         515           1,673         728           1,982         364
1866    2,953         2,389         572           1,817         722           2,231         388
1867    2,752         2,258         534           1,724         607           2,144         373
1868    2,947         2,278         537           1,741         670           2,277         379
1869    2,955         2,370         519           illeg.        illeg.        2,250         405
1870    3,033         2,441         554           illeg.        648           2,384         409
1871    3,310         2,836         556           2,280         729           2,581         469
1872    3,547         3,146         656           2,490         794           2,753         455
1873    3,713         3,110         711           2,399         810           2,903         463
1874    3,701         2,977         779           2,197         822           2,879         494
1875    3,739         2,816         767           2,050         844           2,895         515
1876    3,752         2,568         701           1,866         illeg.        2,908         516
1877    3,944         2,523         758           1,766         896           3,049         590
1878    3,688         2,455         720           1,735         779           2,908         575
1879    3,630         illeg.        665           1,823         789           2,840         543
1880    4,112         2,864         815           2,049         925           3,187         616
1881    3,970         2,971         867           2,104         915           3,055         illeg.
1882    4,130         3,067         923           2,143         illeg.        3,136         658
1883    4,269         3,054         904           2,150         illeg.        3,282         692
1884    3,900         2,960         883           2,077         958           2,942         646
1885    3,710         2,715         885           1,860         844           2,866         638
1886    3,499         2,690         822           1,867         819           2,680         609
1887    3,622         2,813         823           1,990         838           2,784         646
1888    3,876         2,986         917           2,068         869           3,007         684
1889    4,276         3,156         908           2,248         973           3,304         715
1890    4,207         3,283         945           2,337         962           3,245         694
1891    4,354         3,091         933           2,158         illeg.        3,360         716
1892    4,238         2,916         812           2,104         978           3,260         715
1893    4,047         2,771         786           1,986         918           3,129         720
1894    4,083         2,738         786           1,952         939           3,144         716
1895    4,167         2,858         761           2,098         955           3,211         729
1896    4,418         2,964         907           2,057         932           3,486         761
1897    4,510         2,941         870           2,072         940           3,570         760
1898    4,704         2,940         901           2,039         994           3,709         785
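
The contrast between relative and absolute growth in the import figures can be checked by a short calculation. The Python sketch below is an illustration under stated assumptions: the 1862 and 1898 entries are read from the table of imports and exports, and the names `IMPORTS`, `UNIT` and `growth` belong to the sketch, not to the text.

```python
# Imports in 1862 and 1898, unit £100,000, read from the table.
IMPORTS = {
    "Foreign countries": (1604, 3709),
    "British possessions": (653, 994),
    "Germany, Holland and Belgium": (279, 785),
}
UNIT = 100_000  # pounds per table unit


def growth(first: int, last: int) -> tuple:
    """Return (absolute increase in pounds, last value as % of first)."""
    return (last - first) * UNIT, 100 * last / first


results = {name: growth(first, last) for name, (first, last) in IMPORTS.items()}
for name, (increase, percent) in results.items():
    print(f"{name}: +£{increase:,} ({percent:.0f} per cent of 1862)")
```

The German group shows the largest percentage and one of the smaller absolute increases, which is the reason a diagram equated in 1862 would mislead as to amounts.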

[Sidenote: Causal relations.]
B. Series of figures are often compared graphically with a
view to discovering or illustrating causal relations. In such
cases we do not only study relative growth as
in the last diagram discussed, but look throughout
the period for any signs of resemblance in rates of growth, dates 
of maxima and minima, or synchronism in any changes. The 
methods by which such comparisons are made are difficult, and 
need careful analysis. For instance, we may wish to show that 
an increase of the allowance for outdoor relief is connected with 
an increase of pauperism. In this case one line will represent 
money, the other the number of persons, and there is no common 
unit ; we need not calculate percentages, but having chosen any 
scale for money, we can make equality in any year by a simple 
adaptation of the scale for number. We shall wish to establish 
first, that an increase or decrease of money occurred at, or just 
before, an increase or decrease in number ; and secondly, that 
the greater the increase of one the greater the increase of the 
other. In order to show direct connection, we shall try to make
one line lie as nearly as possible over the other. 

Draw a preliminary diagram in which both lines are entered 
on any scales ; this will suggest the resemblances to be tested, 
Notice in what period the fluctuations are greatest ;
this in general should be the period to be taken, 
for it is here that the causal relations have had most play. 
If any other period is chosen for any special reasons, these 
should be made clear, for otherwise a critic may legitimately 
object that it is only in this period that the connection is 
distinct. There would be little difficulty in finding short 
periods in any two curves where the fluctuations synchronised. 
Take the averages of both money and of number over the 
period chosen, and draw a second diagram in which the scale 
for number is chosen by making this average for number equal 
to the corresponding average for money. Any correspondence
between the two lines can be at once detected. 
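
The choice of the second scale by equating averages can be sketched as follows. The figures are hypothetical, invented for the sketch and not taken from the text, and `equate_averages` is an assumed name; the point is only that one multiplying factor brings the average of the one series to the average of the other.

```python
from statistics import mean

# Hypothetical figures, invented for the sketch (not from the text).
money = [40.0, 44.0, 52.0, 48.0, 41.0]         # e.g. outdoor relief, in £1,000
number = [800.0, 900.0, 1000.0, 950.0, 850.0]  # e.g. persons relieved


def equate_averages(reference, other):
    """Rescale `other` so that its average equals that of `reference`."""
    factor = mean(reference) / mean(other)
    return [value * factor for value in other], factor


rescaled_number, factor = equate_averages(money, number)
```

Plotting `money` and `rescaled_number` on one grid gives the second diagram described above; the averages should be taken only over the period chosen for comparison.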

The process just described is completely carried out in the 
first two diagrams comparing the marriage rate and foreign trade 
facing p. 175. 

[Sidenote: Inverse relations.]
There are many cases when the changes in the magnitudes
which we regard as the causes are inversely proportional, in
the opposite sense, to those in the magnitudes
which we regard as the effects. For instance, if
we are comparing trade improvement with the number of 
unemployed, and make the construction just described, the 
maxima of the first line would synchronise with the minima 
of the second. Greater clearness can be obtained by inverting 
one of the diagrams, plotting out the number employed instead 
of that unemployed, and then the changes should be in the 
same sense in both lines. 

[Sidenote: More complex relations.]
In the above construction the lines will only lie one over the
other throughout their fluctuations, if the changes in one quantity
are in strict proportion to the changes in the other,
if an increase of 10 per cent., for instance, in the
allowance for outdoor relief corresponded to one of 10 per cent.
in the number of paupers. It is very rare that such a simple
relation is found ; all we can see in general is that the maxima
and minima occur at the same dates, that the fluctuations agree
throughout in sense in both series, and that the greater fluctuations
in the one correspond to the greater fluctuations in the
other.

[Sidenote: Use of diagrams.]
Diagrams may often be used to suggest correlation between
two series of figures, and this indeed is one of their chief merits,
and they may be used to illustrate arguments on
any subject, but at this point their utility ends, for
they cannot be made to prove much. Causal relations are very 
difficult to establish, and the original figures must be critically 
consulted when theories are to be brought to the test. 

[Sidenote: More exact method.]
We have not yet exhausted the power of diagrams for
making such comparisons, but the following method must be
applied only with great caution. Suppose that
we wish to establish that an increase of 1 bushel
in the quantity of wheat to be bought for a sovereign corresponds
to an increase of 1.5 in the marriage rate per 1,000, or any
such strict numerical proportion. Draw a diagram representing
the quantities of wheat, take the average for the period chosen
for comparison, and write the scale so as to read 1, 2, 3 . . .
bushels above or below the average. Draw no base line. Now
enter a line to represent the excess or defect of the marriage
rate from its average in the chosen period, on a scale such that
1.5 in excess is represented by the same vertical distance as
1 bushel. The closeness of the two lines indicates the validity
of the theory. The danger of this method is, that with no base 
line there is no possibility of judging the amounts of the changes 
relative to the totals. The insertion of the necessary two base 
lines would confuse rather than aid. 



Marriage Rate, Total Exports and Imports per Head of Population,
and Average Price of Wheat per Quarter.

Year.   Marriage   Total Exports and   Average Price of
        Rate.      Imports per Head.   Wheat per Quarter.
                   £  s.  d.           s.  d.

1860    17.1       13  8               53  3
1861    16.3       13  3               55  4
1862    16.1       13  8               55  5
1863    16.8       15  2  7            44  9
1864    17.2       16  8  7            40  2
1865    17.5       16  7  5            41  10
1866    17.5       17  14  5           49  11
1867    16.5       16  9  6            64  5
1868    16.1       17  6               63  9
1869    15.9       17  3  9            48  2
1870    16.1       17  10  3           46  10
1871    16.7       19  9  6            56  8
1872    17.4       21                  57
1873    17.6       21  4  2            58  8
1874    17.0       20  11              55  8
1875    16.7       19  19  4           46  2
1876    16.5       19  10              illeg.
1877    15.7       19  5  5            56  9
1878    15.2       18  2  1            46  5
1879    14.4       17  16  10          43  10
1880    14.9       20  3  3            44  4
1881    15.1       19  17  5           45  4
1882    15.5       20  8  10           45  1
1883    15.5       20  13  2           41  7
1884    15.1       19  4  1            35  8
1885    14.5       17  16  9           32  10
1886    14.2       17  0  10           31
1887    14.4       18  11  7           32  6
1888    14.4       18  12  1           31  10
1889    15.0       19  19  9           29  9
1890    15.5       19  19  7           31  11
1891    15.6       19  14              37
1892    15.4       18  15  6           30  3
1893    14.7       17  14  9           26  4
1894    15.1       17  11  9           22  10
1895    15.0       17  19  3           23  1
1896    15.8       18  14  1           26  2

It is clear from the preceding analysis that, by the choice 
of scales and base lines, the points at any two dates may be 
made to coincide on any number of accurately drawn lines 
representing series of figures. 

The preceding paragraphs are completely illustrated by the 
adjoining diagram. 

[Sidenote: Illustration of method.]
On the left are given lines representing the price of wheat in
shillings per quarter, the total of values of exports and imports
divided by the population, and the marriage rate
per 1,000. The scales chosen are simply those
which are easiest to use, and throw the lines into proper relief. 
The points in each scale for the same years are over one another, 
but the base lines and scales differ. 

[Sidenote: Marriage rate and trade.]
We can see at a glance whether there is resemblance between
the courses of these figures. There is at any rate a general
correspondence between the fluctuations of trade
and of the marriage rate since 1870, and possibly
earlier. There are points of likeness between wheat prices 
and trade; in 1870-73 both rise together, and fall in 1873-75; 
both rise in 1876-77, fall in the following two years, and then 
rise again ; both fall from 1881 to 1886 and then rise. There 
are also many cases in which the motions do not agree, especially 
1862-64, and 1887-89. 

[Sidenote: Marriage rate and wheat.]
If we look now at the price of wheat and the marriage rate,
which in the earlier part of the century used to be closely
related, the one rising when the other fell, we see
that there is no great resemblance either in this
or the contrary sense. In 1860-62 and in 1862-64 wheat rose 
and fell, while the marriage rate fell and rose ; wheat rose in 
1865-67, while the marriage rate was first stationary and then 
fell a little ; then it continued to fall in 1868-70, though wheat was 
falling also ; in 1870-80 the marriage rate shows one long, wheat 
two short, fluctuations. Since 1880, in years in which wheat 
fell, the marriage rate in general fell also and vice versa. 

[Sidenote: Connecting links.]
Let us consider for a moment the possible links of connection
between these phenomena. When wheat was the chief
object of expenditure of the working class, its
price was the chief thing for them to consider ;
and so when wheat rose the marriage rate fell. On the other
hand, now that wheat is cheap and wages higher, a change in
the price of the loaf is only of great importance to a minority ;
it is now the general prosperity of the country, well indicated by
the condition of foreign trade, that raises the marriage rate. 

When exports and imports are increasing in value, trade is 
stimulated, and in spite of rising prices, marriageable people are 
sanguine that the prosperity will remain and the prices fall ; but 
when the prices fall, so do the profits and incomes, and marriageable
people are more prudent. For these reasons we may expect
the marriage rate and foreign trade lines to resemble each other. 

Now the increase of the marriage rate corresponding to an 
inflation of trade, and an inflation of trade to a time of rising 
prices in general, we shall find the price of wheat in particular, 
which is connected with the course of prices in general, rising 
when trade is inflated and falling when it is depressed, and 
therefore rising and falling with the marriage rate. But since 
the price of wheat is influenced also by special causes, it will not 
always correspond to the state of trade, and still less to the 
marriage rate, with its former tendency to opposite variations. 

There is no need then for surprise that the curves of marriage
rate and trade correspond ; that wheat and trade correspond,
but less closely ; and that wheat and marriage show a double
tendency. The correspondence between marriage and trade is
investigated on the diagram. That between wheat and trade
should be done on an identical method. Marriage and wheat 
should be compared twice on different plans: first for direct 
correspondence, and then by redrawing the wheat curve with its 
base line at the top for inverse correspondence. 

[Sidenote: Construction of diagram.]
To effect the comparison between the course of trade and
the marriage rate, the following steps are taken. On examining
the two curves on the first figure, it is seen that
the resemblance does not begin before 1869 ;
the parts of the curves since 1869 should therefore be brought
into close correspondence. The average marriage rate, 1869-94,
is 15.5, and average imports and exports per head, £19. The
marriage curve is drawn in the ordinary way ; then with the
help of a sliding scale the trade curve is put in, so that with
the same base line £19 falls on the 15.5 line.

The result is that the curves are seen to rise and fall at the 
same dates, but not to the same extent; for, while the lines 
keep nearly parallel from 1873 to 1879, the falls from the 
maximum being equal, after 1879 the trade line fluctuates further 
above and below its average than the marriage rate does. 

It remains to test graphically whether the fluctuations are
proportional to one another. The average fluctuations
in the two lines must now be equated.

Marriage Rate.

Year      Maxima.   Minima.   Differences.
1869                15.9
                              1.7
1873      17.6
                              3.2
1879                14.4
                              1.1
1882-3    15.5
                              1.3
1886                14.2
                              1.4
1891      15.6
                               .9
1893                14.7

Average of differences = 1.6

Imports and Exports per Head.

Year      Maxima.      Minima.      Differences.
          £  s.  d.    £  s.  d.    £  s.  d.
1867                   16  9  6
                                    4  14  8
1873      21  4  2
                                    3  7  4
1879                   17  16  10
                                    2  16  4
1883      20  13  2
                                    3  12  4
1886                   17  0  10
                                    2  18  11
1889      19  19  9
                                    2  8  0
1894                   17  11  9

Average of differences = £3 6s. 3d.

Hence £3 6s. 3d. must be represented on the same scale as 1.6.

This is making the hypothesis that a change of £1 in the total
trade per head synchronises with a change of .5 in the marriage
rate per thousand. The scales so chosen are marked above and 
below the common average line in the right-hand figure. 
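
The equating of the average fluctuations can be verified by a short calculation. The Python sketch below is an illustration only: the function names are assumptions, the extremes are those of the two small tables, and the 1886 minimum of trade is read as £17 0s. 10d., which is the reading consistent with the differences printed in the text.

```python
# Successive maxima and minima taken from the two small tables in the text.
MARRIAGE_EXTREMES = [15.9, 17.6, 14.4, 15.5, 14.2, 15.6, 14.7]
TRADE_EXTREMES = [  # (pounds, shillings, pence) per head
    (16, 9, 6), (21, 4, 2), (17, 16, 10), (20, 13, 2),
    (17, 0, 10), (19, 19, 9), (17, 11, 9),
]


def to_pence(pounds, shillings, pence):
    """Convert £ s. d. to pence (12d. = 1s., 20s. = £1)."""
    return (pounds * 20 + shillings) * 12 + pence


def average_fluctuation(values):
    """Mean absolute difference between successive extremes."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)


marriage_avg = average_fluctuation(MARRIAGE_EXTREMES)
trade_avg_pence = average_fluctuation([to_pence(*t) for t in TRADE_EXTREMES])

# Express the average trade fluctuation again in £ s. d.
pounds, rest = divmod(round(trade_avg_pence), 240)
shillings, pence = divmod(rest, 12)

# Change in the marriage rate set against a change of £1 in trade per head.
rate_per_pound = marriage_avg / (trade_avg_pence / 240)
```

Rounded to one decimal place, `rate_per_pound` gives the .5 per thousand used for the right-hand figure.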

It is now seen that the fluctuations since 1880 lie more 
closely together in the two curves, but that this closeness has 
been obtained by the partial sacrifice of the years 1872-80, and 
there is now a complete disagreement before 1870. A yet 
shorter period, 1879-1893, would show a very close agreement ;
but so special a selection would vitiate any general argument. 

Our conclusion is, that since 1870 the causes which affect 
foreign trade have also affected the marriage rate at the same 
dates and in the same sense, and that the more marked the 
effects on the one, the more marked are the effects on the other 
also, but that there is no law of simple proportion between them. 

Note. The relations tested by the middle diagram may be
represented by the equation $\frac{x}{a} = \frac{y}{b}$, and that of the right-hand
diagram by $\frac{x - a}{y - b} = c$ (a constant), where $x$ and $y$ stand for the value
of trade and the marriage rate, and $a$ and $b$ for their average
values, and $c$ is chosen so as to make the average fluctuations of
the two sets of quantities equal. By the method of least squares
$c$ could be chosen so that the correspondence should be closer
than with the value given by the calculation in the text.
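
One way in which the constant c of the note might be fixed by least squares is sketched below in Python. This is an assumption-laden illustration: the note does not say which sum of squares is to be made a minimum, and the sketch takes the deviations of trade from its average as c times the deviations of the marriage rate; the figures are hypothetical and the function name is invented.

```python
def least_squares_c(trade, marriage):
    """Slope c minimising the sum of squares of (x - a) - c * (y - b)."""
    a = sum(trade) / len(trade)          # average of trade
    b = sum(marriage) / len(marriage)    # average of the marriage rate
    numerator = sum((x - a) * (y - b) for x, y in zip(trade, marriage))
    denominator = sum((y - b) ** 2 for y in marriage)
    return numerator / denominator


# Hypothetical figures, invented so that the relation is exactly proportional.
trade = [17.0, 19.0, 21.0, 23.0]      # pounds per head
marriage = [14.0, 15.0, 16.0, 17.0]   # per 1,000

c = least_squares_c(trade, marriage)
```

With real figures the deviations would not be exactly proportional, and c would be a compromise over all the years rather than over the maxima and minima alone.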

4. Periodic Figures.

We now come to the consideration of periodic figures ; that 
is, of figures which within a given period, in a year for instance 
when returns are monthly, reach maxima and 
minima at assigned times, and show fluctuations 
recurring with regularity in successive periods. In physical 
phenomena, such as the sunrise, the same daily numbers will 
represent the phenomena, almost without change, year after 
year. In the case of the tides we find a link between the 
more rigid annual curves of seasonal phenomena, and the less 
marked periods of social statistics ; for the tides are subject to 
separate influences with periods of 24 hours, 24 hours 50 min., 
29 days, 1 year, and others, and the effects of these influences
are often masked one by the other. In the weekly figures of 
the Bank of England, Jevons discovered monthly, quarterly, and 
annual periods.* 

In social and industrial statistics we usually find an annual 
period, combined with a general slow movement upwards or 
downwards, and confused by an irregular period of about ten 
years, due to alternate inflation and depression of trade. The 
influences of these three movements on the resulting numbers 
can be investigated, and the general methods of examining 
periodic figures fully explained by the complete discussion of one 
example, viz., the monthly returns of want of employment of the 
Friendly Society of Ironfounders. For another example the 
reader is referred to Jevons' essay, On the Frequent Autumnal
Pressure in the Money Market ;* and for an exercise, to the
monthly gazette wheat prices, where the gradual change of the 
shape of the annual diagram can be traced in relation with 
the increasing influence of harvests in all the quarters of the 
globe. 

* See Investigations in Currency and Finance.

[Sidenote: General features of the figures.]
These figures are specially suitable for showing graphically
a double period, and the influences of rapid annual fluctuations and
general movements of longer period on each other.
Looking at the table on p. 179 along the lines for
the several years, it will be seen that there is always a fall in the
middle of the year. Looking down a vertical column under any
month, it will be seen that there is no generally marked tendency
towards increase or diminution, for high and low numbers
occur in the first as well as the last few years. The most noticeable
feature of these figures is the alternation of groups of years
of high and of low numbers. Percentages above 10 will be found
in 1861-63, 1866-70, 1877-81, 1884-87, and 1892-93. Let us choose
for examination the period 1866-70. The figure for January
1866 is below the Januaries of previous years ; those of February,
March, and April are also low ; from May to September the figures
are greater than those of 1865 or 1864 ; from October to December
they are greater than those of 1863, 1864, or 1865 ; in December
1867 they are greater than any previous year. Most of
the figures for 1868 beat the record up to that date ; but from
September 1868 the figure is lower than the one twelve months
earlier till July 1872. This wave of unemployment then lasted
from May 1866 to September 1872.

Number of Unemployed Ironfounders, expressed as percentages
of estimated total number of members, month by month : calculated
from figures given in the Annual Report of the Friendly Society of
Ironfounders, 1894.

Year.   Jan.    Feb.    Mar.    April.  May.    June.   July.   Aug.    Sept.   Oct.    Nov.    Dec.    Average.

1855    11.1    14.1    14.0    12.5    10.?    9.9     8.7     8.7     6.8     7.7     8.8     12.0    10.4
1856    10.9    12.6    12.2    10.0    9.4     7.5     6.9     7.3     6.9     8.1     8.7     9.9     9.2
1857    10.1    9.5     8.7     8.7     8.1     illeg.  6.8     6.9     6.2     8.0     14.0    17.7    9.3
1858    20.2, 20.6, 20.9, 19.8, 20.3, 15.9, 14.3, 13.1, 11.9, 11.5, 11.2 (one monthly figure illegible) ; average 16.5
1859    10.6    8.8     6.5     5.2     4.0     4.4     3.2     3.6     3.4     3.8     4.6     5.1     5.3
1860    4.0     3.2     2.6     2.2     1.6     1.7     2.3     2.6     2.6     2.9     3.7     5.6     2.9
1861    6.0     6.9     6.5     7.9     7.8     8.4     6.9     7.9     9.5     10.7    12.4    13.8    8.7
1862    14.5    14.0    14.0    14.6    14.4    13.7    13.3    12.9    12.2    13.5    14.9    16.0    14.0
1863    15.5    13.9    13.6    11.6    10.4    9.3     8.1     7.8     7.4     6.6     5.3     5.0     9.5
1864    6.0     7.1     6.6     5.3     4.4     3.3     2.8     2.8     2.6     3.3     4.2     8.1     illeg.
1865    5.4     5.3     5.3     4.6     3.4     2.9     2.6     3.?     2.7     2.6     2.3     4.9     illeg.
1866    4.2     5.4     5.1     3.6     5.1     6.5     5.9     6.5     6.9     7.4     9.3     13.8    illeg.
1867    12.4    13.2    15.4    illeg.  14.9    14.6    14.2    13.9    15.7    16.3    18.9    22.6    15.7
1868    22.1    20.9    19.8    18.6    16.7    15.8    14.9    14.7    14.2    14.1    15.6    17.4    illeg.
1869    17.3    17.1    16.8    15.6    15.2    13.6    13.3    11.8    13.1    13.6    14.8    illeg.  illeg.
1870    14.5    10.9    8.7     7.5     5.0     4.5     illeg.  4.5     4.9     5.0     5.6     8.3     illeg.
1871    7.2     5.6     3.6     2.8     1.6     1.5     1.6     1.2     .9      1.4     1.1     2.2     2.6
1872    1.1     1.1     .9      .8      1.2     .7      .9      1.0     1.3     1.8     2.6     4.1     1.5
1873    3.3     2.8     2.7     2.5     2.1     2.0     3.0     4.9     4.3     3.3     3.3     5.1     3.3
Average
1855-73 10.3    10.2    9.7     8.9     8.2     7.7     7.1     7.2     7.1     7.5     8.5     10.4    8.6

1874    4.9     3.9     3.9     3.5     4.9     3.9     3.8     3.4     3.5     3.7     3.9     5.0     4.0
1875    4.6     3.4     3.5     2.8     2.8     2.8     3.3     illeg.  3.6     4.1     4.1     5.0     3.6
1876    4.9     4.9     4.9     5.4     4.8     5.2     5.7     5.8     6.4     6.4     6.2     10.3    5.9
1877    7.7     7.4     7.0     6.9     8.4     7.6     7.4     7.8     9.6     10.9    12.3    16.3    9.1
1878    14.0    14.3    13.5    15.3    13.3    14.6    13.6    13.2    13.3    14.0    18.0    21.0    14.7
1879    23.2, 23.8, 24.7, 25.5, 22.3, 23.4, 21.5, 22.6, 22.5, 21.1, 16.6 (one monthly figure illegible) ; average 22.1
1880    15.2    12.9    11.1    10.0    10.0    9.7     9.8     10.0    10.0    9.2     9.2     10.2    10.6
1881    11.5    10.8    10.1    10.1    7.6     illeg.  6.5     5.8     5.6     5.4     5.0     6.6     7.7
1882    5.5, 5.2, 5.3, 4.5, 3.6, 3.2, 3.4, 3.6, 4.1, 4.4, 6.0, 4.4 (one figure illegible)
1883    3.6     4.8     5.2     4.3     4.2     3.6     3.9     4.3     4.3     4.2     4.0     6.6     4.4
1884    6.1     6.2     5.9     6.5     illeg.  6.9     6.5     7.6     8.1     7.8     9.8     10.9    7.4
1885    10.2, 11.1, 10.0, 10.1, 9.1, 9.8, 10.7, 11.8, 11.6, 12.7, 13.6 (one monthly figure illegible) ; average 10.9
1886    14.1    15.0    15.2    15.5    13.4    13.1    12.1    12.7    13.6    13.9    12.7    12.9    13.7
1887    12.4    11.6    10.2    9.1     9.2     10.6    9.2     8.8     9.6     9.4     9.4     9.1     9.9
1888    7.8     7.5     6.4     6.4     5.9     5.2     5.7     5.0     5.1     4.8     3.2     3.5     5.5
1889    3.1     3.3     2.4     2.1     1.7     1.6     1.7     1.7     1.6     1.5     1.2     1.4     1.9
1890    1.3     1.3     3.2     3.1     2.8     2.4     2.4     2.7     2.7     2.7     2.7     2.7     2.5
1891    3.9     3.5     4.2     illeg.  4.6     4.0     4.5     4.8     5.4     5.6     5.7     6.3     illeg.
1892    7.0     7.2     7.9     8.1     7.9     7.9     7.7     7.6     9.3     11.4    10.9    12.0    illeg.
1893    11.5    11.2    10.1    7.7     9.6     8.3     8.3     9.2     11.7    11.9    11.5    11.5    10.2
Average
1874-93 8.6     8.5     8.2     8.1     7.7     7.6     7.3     7.5     8.1     8.2     8.1     9.4     8.1
Average
1855-93 9.4     9.3     8.9     8.5     7.9     7.6     7.2     7.4     7.6     7.9     8.3     9.9     8.3

[Side-note: Seasonal influence.]
Now let us watch the seasonal influence. In 1866 there was no fall
in the summer except in April, and there was a very rapid rise in
December. In 1867 a fall in May and a slight fall from June to
August was followed by a rapid rise in November and December. There
is a fall from December 1867 to September 1868, but a rise follows
in October, November, and December; since the rise does not
generally begin till August, it will be seen that the general fall
did not much delay the seasonal effect. In the next year, 1869,
there is a fall to a lower minimum in August, but now the rise in
December is very slight; next year the fall is very quick to August,
but the seasonal rise is not delayed. From this it is clear that the
seasons had their effect throughout the fluctuation except in the
opening year 1866, when there was no fall, and that the rises in the
autumn were very much accentuated. Almost identical remarks would
apply to the period August 1875 to May 1881. In what month was the
depression of trade 1867-70 at its worst? The greatest figure given
is 22.6 per cent. in December 1867, but unemployment in December is
generally greater than in any other month, and the figures for any
of the following six months may be more unusual; the determination
of the exact date will be best shown by diagrams. It may be
mentioned that most of these remarks were suggested by Mr Hey, the
former secretary of the Ironfounders' Society, who drew up these
figures.
[Side-note: The story from the diagram.]
If we now turn to the diagram, the following facts may be noticed.
The thick line showing the annual average percentages shows a
downward tendency till 1857, followed by an abrupt rise and fall in
1858, then three years' rise to its original height, returning to a
minimum in 1865; the next wave covers six years, and is marked by an
extraordinarily sharp rise in 1867, and a very low minimum in 1872.
The exceptional condition of trade in 1872 could not last, but the
rise is very gradual to 1876, when the next cycle of trade is marked
again by a six years' wave: the rise is not so steep as in the
former fluctuation, but lasts longer, and a higher point is reached:
the fall is at about the same angle, and the minimum in 1882 is
about the same as that in 1865. The next wave came before it
appeared to be due, and lasted seven instead of six years, but was
much more moderate, and again the rise was sharper than the fall.
The minimum of 1889 did not endure, and the figure ends with a
suggestion that the maximum will be in 1894, but only at a moderate
height, and the next minimum might be expected in 1898 or 1899, if
causes similar to those which influenced earlier trade depressions
were still acting. It may be found, in fact, from the Board of Trade
returns, that, taking all the trade unions who made returns
together, the maximum month was December 1892, and the maximum year
was 1893; after this the fall is regular to 1897, and a trifling
rise in 1898 is followed by a very low figure for 1899.*

In figure 5 the diagram is inverted and greatly compressed, showing
now the percentage employed. If the period 1876-82 is cut off by two
vertical lines, readers may see how great were the amounts of labour
lost to the country and wages to the workers in those years, and
will agree with Professor Foxwell† that irregularity of employment
is one of the greatest evils endured by the working classes.

In figure 5 the annual averages are smoothed by the method explained
above (p. 152), a seven-yearly average‡ being taken to correspond to
the general wave length. It will be seen that there is no very
marked tendency up or down in the thirty-nine years, and that the
smooth line is never far from the general average of employment,
91.7.

* See Annual Abstract of Labour Statistics, 1895, p. 73, for various
methods of treating these figures similar to those here discussed.

† See Lectures on the Labour Question, 1886.

‡ For smoothing and studying periodic curves, see Professor
Poynting's paper in Statistical Journal, 1884.
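
The seven-yearly smoothing mentioned above can be sketched as follows. This is only an illustration: the window is assumed to be centred on each year (the method of p. 152 is not reproduced here), the function name is hypothetical, and the figures are invented solely to exercise it.

```python
def centred_moving_average(values, window=7):
    """Average each value with its neighbours over an odd-length window.

    Years too near either end to have a full window are returned as None.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centred on a year")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            smoothed.append(None)
        else:
            smoothed.append(sum(values[i - half:i + half + 1]) / window)
    return smoothed


# Hypothetical annual percentages, invented purely to exercise the function.
annual = [3.0, 5.0, 9.0, 7.0, 4.0, 2.0, 5.0, 8.0, 6.0]
smoothed = centred_moving_average(annual, window=7)
```

A window equal to the length of the wave is chosen so that each average takes in roughly one complete rise and fall, leaving only the slower movement.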

The comparison of this diagram with that illustrating ex- 
ports (p. 151) is very instructive. Some of the results may 
be thus exhibited : — 





                                  DATES OF
Minima of      Maxima of          Maxima of      Minima of
Exports.       Unemployment.      Exports.       Unemployment.

1862           1858 and 1862      1866           1865
1868           1868               1872           1872
1879           1879               1882           1882 or 1883
1886           1886               1890           1889
1894           1893


The figures may also be compared graphically by the methods 
of the previous or following sections. 

[Side-note: Measurement of seasonal ...]
The averages for the nineteen Januaries, nineteen Februaries, &c.,
in the years 1855-73, and similar averages for the years 1874-93,
and for the whole period, are given in the table and exhibited in
figures 2, 3, 4. When we calculated the annual averages just
discussed we eliminated by that process the seasonal fluctuations;
by this new series of averages we eliminate the influences of
particular years. If we took, for instance, all the November numbers
out of a series of figures totally uninfluenced by the seasons, if
such could be found, and compared these with the general average for
all months, we should in the long run find just as many instances
above as below this average; but if the figures were influenced by
the seasons, we should find a considerably greater number above than
below, or vice versa. The greater the seasonal influence, the
greater would be this excess or defect. Averaging numbers in this
way eliminates the non-seasonal causes, for by hypothesis the
excesses and defects due to them will in the long run balance one
another; and except by averaging these cannot be eliminated, unless
they can be actually calculated. The excess of the November average
above the general average will be greater than that of October, if
the seasonal causes exert more influence towards excess in the
former than in the latter month, and the curve which shows these
averages will show a resemblance to that which would be obtained, if
the non-seasonal causes were absent. It will be only a resemblance
for two reasons: first, because in the comparatively short series of
years with which we are generally obliged to be content, a very
effective non-seasonal cause will leave its mark on the average, as
may be seen in the table on p. 179; secondly, because seasonal and
non-seasonal causes are often not independent; a depression of trade
is accentuated by a sharp winter; a bad season in a year of bad
trade may increase the want of employment greatly and suddenly,
while a good summer in a prosperous year may reduce it almost to
zero. In the case we are considering the interaction of causes tends
to exaggerate the seasonal maximum and diminish the minimum; in
other cases a contrary effect might be found.

In figures 2, 3, 4 the curve for the latter half of the year is
prefixed to that of the calendar year, because the character of the
yearly waves is seen most clearly from minimum to minimum. It may be
noticed that the wave in figure 3 is less definite in shape and has
a smaller rise and fall than that of the earlier period shown in
figure 2; it would appear that the seasons are losing their
influence.

[Side-note: The annual wave.]
If there is a definite annual period, that represented by figure 4,
it may be expected that a figure of a shape similar to this

[Small diagram: outline of the annual wave.]

will be repeated annually in figure 1; it is shown well in 1864,
1882, and other years. In the great majority of cases the yearly
maximum is reached in December or January; at the end of 1858 the
maximum is absent, but is replaced by a break in the rapidity of the
fall; at the end of 1860 there is a rise, but the spring fall
following is checked by the general upward trend; similar remarks
apply to all the great fluctuations. There is no doubt that right
along the line we find at nearly equal intervals these pointed
crests above the line of averages.

The minima are not so conspicuous, for the pointed shape is absent,
trifling causes bring them near the smoothed line, and they are
easily masked by a general fall or are absent because of a general
rise. In 1861, however, there is a distinct minimum in spite of the
strong upward tendency; the minima are very conspicuous throughout
the fluctuation of 1865-70; and from 1859 to 1888 the minima are
fairly marked, except in 1876, 1880, and 1881.

The following figures show the effect of a stationary, rising, and
falling average annual rate on the shape of the seasonal wave:

[Diagrams, each drawn on a vertical scale of percentages over
successive Januaries and Decembers:
a. Seasonal wave on stationary line of averages.
b. Seasonal wave superimposed on rising line of averages.
c. Seasonal wave superimposed on falling line of averages.]

These figures are drawn by adding or subtracting the average monthly
differences from the general average

  Jan.  Feb.  Mar.  Apr.  May.  June.  July.  Aug.  Sept.  Oct.  Nov.  Dec.
  +1.1  +1.0  +.6   +.2   -.4   -.7    -1.1   -.9   -.7    -.4    0    +1.6

month by month to or from the positions shown on the straight lines
joining the annual averages. On a rising line the spring fall tends
to become horizontal and the autumn rise steeper; on a falling line
the spring fall becomes more rapid and the autumn rise is checked.
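
The average monthly differences quoted above can be recomputed from the monthly averages for 1855-93 and the general average of 8.3. The sketch below is an illustration only; the figures are as I read them from the table of monthly percentages, and the variable names are my own.

```python
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May.", "June.",
          "July.", "Aug.", "Sept.", "Oct.", "Nov.", "Dec."]

# Monthly averages of the percentage unemployed, 1855-93 (as read from the table).
monthly_averages = [9.4, 9.3, 8.9, 8.5, 7.9, 7.6, 7.2, 7.4, 7.6, 7.9, 8.3, 9.9]
general_average = 8.3

# Excess (+) or defect (-) of each month's average over the general average.
differences = [round(m - general_average, 1) for m in monthly_averages]

for month, diff in zip(MONTHS, differences):
    print(f"{month:<6}{diff:+.1f}")
```

The excesses and defects nearly balance, as they must when the general average is itself the mean of the twelve monthly averages.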

If this seasonal wave, added to the slower long-period changes, were
the complete explanation of these numbers, figure 1 (p. 179) would
be entirely composed of modifications of figures a, b, and c. Figure
a is exemplified especially in 1855-57, 1864-65, 1871-73; figure b
in 1860-61, 1866-67, 1877-78, 1883-85; figure c in 1859, 1863,
1880-82, 1886-89.

[Side-note: Elimination of fluctuations.]
As explained above, the two sets of causes are not independent, and
these figures are not reproduced exactly; but the resemblance is
sufficiently close to make the following method of eliminating
seasonal fluctuations partially applicable. Combine the monthly
excesses and defects just given with the original numbers, by
subtracting the excesses and adding the defects; this process should
tend to produce a straight line, thus:

[Diagram: the line from figure 1 compared with the corrected
figures.]

But the result is not more than a tendency, because of the unusual
fall in January 1883, and it is difficult to find a perfect example.
This method is applied in figures 6, 7, and 8 in an attempt to
disentangle the seasonal fluctuations from the effects of the
commercial crisis of 1872, the depression of 1879, and the turn of
the tide in 1883. In figure 6 it is seen that January 1872 was the
best month relatively, though the absolute minimum was not reached
till June of that year; from this it appears that January 1872 was
the turning point of the great inflation, a date somewhat earlier
than that generally given. The date of the maximum of 1879 is left
unchanged by this process, and that of the 1889 minimum is only
shifted one month.
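
The correction just described (subtracting the excesses and adding the defects) amounts to subtracting the signed monthly difference from each monthly figure. The following is a minimal sketch under that reading; the function name is hypothetical, and the 1887 figures are as I read them from the table of monthly percentages.

```python
# Average monthly differences from the general average, January to December.
SEASONAL = [1.1, 1.0, 0.6, 0.2, -0.4, -0.7, -1.1, -0.9, -0.7, -0.4, 0.0, 1.6]


def remove_seasonal(monthly_values):
    """Subtract each month's average excess (or add its defect)."""
    if len(monthly_values) != 12:
        raise ValueError("expected twelve monthly values, January to December")
    return [round(v - s, 1) for v, s in zip(monthly_values, SEASONAL)]


# Percentages unemployed month by month in 1887, as read from the table.
year_1887 = [12.4, 11.6, 10.2, 9.1, 9.2, 10.6, 9.2, 8.8, 9.6, 9.4, 9.4, 9.1]
corrected_1887 = remove_seasonal(year_1887)
```

Whatever movement remains in the corrected figures is then to be ascribed to causes other than the average seasonal wave, subject to the caution above that the two sets of causes are not independent.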

[Side-note: Criteria of existence of period.]
We have still to discuss the criteria of the existence of a period.
In figure 1 the optical evidence is sufficient to suggest the annual
period, but it may be doubted whether an annual fluctuation would be
suggested by a diagram representing wheat prices. It is clear that
if the monthly entries of any returns whatever were averaged in
months over any period of years, the averages for January, February,
&c., would not be exactly equal, even if there were no seasonal
influence. The following diagrams show various averages:

[Diagrams of monthly averages:
Unemployed ironfounders, as before.
Wheat prices, shillings per quarter, 1877-91.
Wheat prices, shillings per quarter, 1862-76.
Average date of first Sunday in month, 1881-1900.]

Of these the first three may be expected to be seasonal, while the
last, which shows the averages of the dates on which fell the first
Sunday in 20 Januaries, 20 Februaries, &c., in a series of years,
certainly is not.

The following simple tests may be applied to decide this point. If
the period is in any way connected with the seasons, it will
correspond to some extent to the ordinary weather charts of
temperature, &c., which have a single annual maximum and
corresponding minimum. Phenomena affected by the weather may also be
expected to show a single maximum, nearly coinciding with the
maximum or minimum temperature; thus the maximum unemployed
coincides with the minimum length of daylight and precedes the
minimum temperature. In some cases a second subsidiary maximum may
be shown, since, for example, an excessive death rate may be due to
excessive cold or heat; but even in this example further analysis
would probably show that the one maximum was for the old, the other
for the young. Wheat prices may also show two minima due to the
harvests in the two hemispheres. The "Sunday" curve just given shows
four maxima, and is not seasonal. More than one maximum is evidence
against periodicity till their existence can be explained.

[Side-note: Probability ...]
The second test is to look at the serial diagram and notice how
often the maximum occurs in the same month; non-periodic causes will
hide the maximum occasionally, but in the long run one month will be
predominant. In figure 1 the maximum occurs in March and April twice
each, in February three times, in January eleven times, and in
December twenty-one times. The maximum is then generally in
midwinter. The minimum is not in this case so well defined. The
following table shows how this analysis can be extended:

                                                          Times
                                                        out of 39.
The percentage of December is greater than that
    of the preceding November                               33
The percentage of December is greater than that
    of the following January                                28
The percentage of December is greater than that
    of the preceding July                                   33
The percentage of December is greater than that
    of the following July                                   30

The chances against so great a preponderance, if the seasons had no
influence, are respectively 70,000 to 1, 106 to 1, 70,000 to 1, and
940 to 1.* All the months may be separately tested in the same way.
This method by no means exhausts the evidence, for we have only
considered which of two months is the greater, and not how great is
the excess when it exists. On this point the reader is referred to
the paper by Professor Edgeworth, On Methods of Statistics, in the
Jubilee Volume of the Royal Statistical Society, p. 206; this
should, however, be postponed till the mathematical treatment which
follows in Part II. has been studied.

* See Part II., Sect. I., infra.
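
A modern reader may wish to check the order of magnitude of such odds directly. The sketch below is an assumption of mine, not the calculation of Part II: it supposes that, if the seasons had no influence, each of the 39 comparisons is an even chance, and it computes the exact one-sided binomial probability of so many or more successes. The odds it gives need not agree with those quoted above, which rest on the approximation developed later in the book.

```python
from math import comb


def tail_probability(successes, trials):
    """P(X >= successes) when X counts heads in `trials` fair tosses."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


def odds_against(successes, trials):
    """Odds (to 1) against so many or more successes arising by chance."""
    p = tail_probability(successes, trials)
    return (1 - p) / p


comparisons = [
    ("December above preceding November", 33),
    ("December above following January", 28),
    ("December above preceding July", 33),
    ("December above following July", 30),
]
for label, times in comparisons:
    print(f"{label}: {times} of 39, about {odds_against(times, 39):,.0f} to 1")
```

The ordering of the four results is what matters for the argument: the greater the preponderance out of 39, the longer the odds against its being accidental.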
5. Logarithmic Curves.

[Side-note: Need for graphic representation ...]
A serious flaw in the graphic method as used in the previous
sections is that, when we are dealing with a series of increasing
figures, though the totals year by year may be increasing, we are
compelled to represent equal increments on these totals by equal
vertical distances; thus an increment of £20 on a total of £20 is
represented by the same vertical distance as an increment of £20 on
a total of £2,000. Thus in the annexed figure representing exports,
the fall from £52,000,000 to £42,000,000 in 1815-16 is barely
noticeable, though it is a fall of 20 per cent. and was connected
with very great distress in the manufacturing districts; while the
fall from £305,000,000 in 1883 to £269,000,000 in 1886 attracts
attention immediately, though it is one of 12 per cent. only. Again
the increase of 34 per cent. which took place between 1848 and 1850
appears insignificant in comparison with that of 29 per cent. from
1870 to 1872. When we are attacking questions of causation it very
frequently happens that we are more concerned to know the
proportionate increase than the actual increase. When we are
considering the gradual growth of our foreign trade, or when we are
comparing the growth of trade of two countries, a diagram like that
annexed is likely to give quite a wrong impression of the struggle
that marked the early stages. We need then a diagram not of
quantities, but of ratios, where equal vertical distances represent
no longer equal absolute increments, but equal proportional
increments, that is, equal rates of increase. By the use of
logarithms a universal scale can be constructed which serves this
purpose. The non-mathematical student can easily accustom himself to
the use of diagrams so constructed, by studying one where the actual
amounts represented are entered, and noticing that whatever part of
the scale he takes, doubling, halving, increasing by 20 per cent.
and so on, are always represented by the same vertical distances
respectively.

[Side-note: Construction of a logarithmic ...]
The construction of a diagram on this scale is as follows: Write
down the numbers in the series to be represented; against them write
down their logarithms; on paper divided into equal squares mark at
equal intervals on a vertical line numbers ascending in regular
progression so as to include all the logarithms found; mark off the
dates on a horizontal line; and on the scale thus prepared mark in
the logarithms, instead of the original numbers. The table on p. 191
and the diagram facing p. 190 show the figures of imports and
exports thus treated. On the right hand of fig. 2 the position of
the absolute numbers is given; on the left the corresponding
logarithms. A given vertical distance, 1 inch, represents the
distance .301 on the logarithmic scale; if we add this quantity to
the logarithm of any number, we obtain the logarithm of twice that
number, for log a + .301 = log a + log 2 = log 2a; for instance, if
we increase the height of the position which represents £30 by
1 inch, we arrive at the position which represents £60. Again if we
now add 1.59 of an inch, which represents .477 on the same scale as
before, that is log 3, to the logarithm of 2a, we obtain log 6a, and
we have

    log 6a = .477 + log 2a = .477 + .301 + log a, as above,
           = .778 + log a = log 6 + log a;

that is, we arrive at the same position on this scale whether we go
by means of two separate ratios or by a single compounded ratio.
Thus a diagram drawn on this principle satisfies the necessary
conditions that equal vertical distances represent the same process
in whatever part of the scale they are taken, and that any number of
points can be entered without leading to inconsistencies. At the end
of this section is given a table of the logarithms of 1 to 1,000,
correct to the third decimal place, which will be found sufficient
for this purpose.
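
The arithmetic of the scale can be checked with a few lines of code. This is a sketch only: it takes common (base 10) logarithms, assumes as in the text that 1 inch stands for log 2, and uses a hypothetical helper name.

```python
from math import log10

# One inch of height stands for log 2 (about .301), as stated in the text.
INCHES_PER_LOG_UNIT = 1 / log10(2)


def height_in_inches(value, reference):
    """Vertical distance on the logarithmic scale from `reference` up to `value`."""
    return (log10(value) - log10(reference)) * INCHES_PER_LOG_UNIT


# Doubling is the same distance wherever it is taken on the scale.
rise_30_to_60 = height_in_inches(60, 30)
rise_1000_to_2000 = height_in_inches(2000, 1000)

# Trebling adds log 3 (about .477), or about 1.59 inches.
rise_60_to_180 = height_in_inches(180, 60)

# Two successive ratios land at the same point as the compounded ratio.
two_steps = rise_30_to_60 + rise_60_to_180
one_step = height_in_inches(180, 30)
```

Here `two_steps` and `one_step` agree (up to rounding in the machine), which is the consistency property claimed for the scale.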

[Side-note illegible.]
Thus on the diagram given we can see at once that imports were
doubled in value between 1810 and 1836, again between 1840 and 1853,
again between 1855 and 1866, and that their value increased 40 per
cent. between 1886 and 1899. Or we may notice that the excess of the
value of imports over that of exports was 40 per cent. of the latter
both in 1850 and in 1880; that the value of imports in 1899 was
thrice that of exports in 1860.

If the eye has been carefully educated to understand a diagram of
this sort, if the fact that it is a diagram of ratios, not of
quantities, is firmly impressed on the mind, then the diagram
answers perfectly the object of the graphic method, that is, it
gives a true instantaneous impression of a complex series of facts.
If, on the other hand, it is found that a true impression is not
received, through inability to take the right mental position, then
diagrams on the natural scale should be employed only, always with
the recollection that they may give false impressions of ratio.*

[Side-note: Velocity and acceleration.]
It is to be noticed that no base line should be given in diagrams of
this class, otherwise a false impression is at once obtained. Notice
further that, while equal vertical differences represent equal
ratios from any part of the diagram to any other, instead of equal
increments as on the natural scale, equal degrees of slope represent
equal ratios of increase (equal accelerations), instead of equal
additions in equal times as on the natural scale (equal velocities).
On the logarithmic scale a line rising with convexity to the
horizontal shows that the ratio of increase is growing, as in
imports from 1830-1853 (if the line is smoothed), while concavity,
as from 1854 to 1873, shows a slackening; but on the natural scale
the line is convex almost throughout the two periods, showing that
the actual increments were increasing all the time.

[Side-note: Useful application to index-numbers.]
It would be useful, if space permitted, to offer several diagrams on
both scales; for in many series of figures the differences exhibited
by the two methods are very instructive. One case may be signalized
where the logarithmic scale is specially important, that is, when
the original numbers represent ratios, not actual numbers. Thus in
Mr Sauerbeck's well-known diagram, drawn on the natural scale,
representing his index-numbers of prices, all the numbers included
are percentages of their values in certain defined years. Suppose
that 100, 80, and 60 are the index-numbers for three years, then on
the natural scale the decrements are represented by equal distances
and appear to be equal. The falls in the value of gold, however, are
by no means equal in the two periods. In the first, the fall from
100 to 80 is one of 20 per cent.; 16s. at the second date would buy
goods which cost £1 at the first. In the second, the fall from 80 to
60 is one of 25 per cent.; 15s. at the last date would buy goods
which cost £1 at the middle date. For the purposes of price
index-numbers it is ratios which are important and which the diagram
should represent.

* Professor Marshall suggests a simple method of correcting this
false impression in his paper On the Graphic Method of Statistics,
in the jubilee volume of the Journal of the Royal Statistical
Society, p. 257 seq.
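
The index-number example can be worked through numerically. The sketch below is illustrative only, with variable names of my own; it compares the equal steps seen on the natural scale with the unequal proportional falls and the corresponding distances on a logarithmic scale.

```python
from math import log10

index_numbers = [100, 80, 60]
pairs = list(zip(index_numbers, index_numbers[1:]))

# On the natural scale both steps look alike: a drop of 20 points each.
natural_steps = [earlier - later for earlier, later in pairs]

# As proportions of the earlier figure the falls are 20 and 25 per cent.
percentage_falls = [(earlier - later) / earlier * 100 for earlier, later in pairs]

# Shillings needed at the later date to buy what cost 1 pound (20s.) earlier.
shillings_needed = [20 * later / earlier for earlier, later in pairs]

# On the logarithmic scale the second step is visibly the longer.
log_steps = [log10(earlier) - log10(later) for earlier, later in pairs]
```

The second logarithmic step exceeds the first, so that a diagram on that scale shows at sight the inequality which the natural scale conceals.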
Year.  Imports.*  Logarithms.  Exports.†  Logarithms.  |  Year.  Imports.*  Logarithms.  Exports.†  Logarithms.

1800      28        1.447        34         1.531      |  1850     100        2.000        71         1.851
1801      31        1.491       ...          ...       |  1851     111        2.045        74         1.869
1802      29        1.462       ...          ...       |  1852     109        2.037        78         1.892
1803      26        1.415       ...          ...       |  1853     123        2.090        99         1.996
1804      27        1.431       ...          ...       |  1854     152        2.182       116         2.064
1805      28        1.447        38         1.580      |  1855     144        2.158       117         2.068
1806      27        1.431        41          [?]       |  1856     173        2.237       139         2.143
1807      27        1.431        37         1.568      |  1857     179        2.252       146         2.164
1808      27        1.431        37         1.568      |  1858     165        2.216       140         2.146
1809      32        1.505        47         1.672      |  1859     179        2.252       156         2.193
1810      39        1.591        48         1.681      |  1860     211        2.324       165         2.217
1811      27        1.431        33         1.518      |  1861     217        2.336       166         2.204
1812      26        1.415        42         1.623      |  1862     226        2.354       166         2.220
1813     ...         ...        ...          ...       |  1863     249        2.396       197         2.295
1814      34        1.531        45         1.653      |  1864     275        2.439       213         2.328
1815      32        1.505        52         1.716      |  1865     271        2.432       219         2.340
1816      37        1.568        42         1.623      |  1866     295        2.470       239         2.379
1817      31        1.491        42         1.623      |  1867     275        2.439       226         2.354
1818      37        1.568        46         1.663      |  1868     295        2.470       228         2.358
1819      31        1.491        35         1.544      |  1869     295        2.470       237         2.375
1820      32        1.505        36         1.556      |  1870     303        2.481       244         2.387
1821      31        1.491        37         1.568      |  1871     331        2.519       284         2.453
1822      31        1.491        37         1.568      |  1872     355        2.550       315         2.498
1823      36        1.556        35         1.544      |  1873     371        2.569       311         2.492
1824      37        1.568        38         1.580      |  1874     370        2.568       298         2.475
1825      44        1.643        39         1.591      |  1875     374        2.573       282         2.450
1826      38        1.580        32         1.505      |  1876     375        2.574       257         2.410
1827      45        1.653        37         1.568      |  1877     394        2.596       252         2.401
1828      45        1.653        37         1.568      |  1878     369        2.567       245         2.389
1829     [?]         [?]         36         1.556      |  1879     363        2.559       249         2.396
1830      46        1.663        38         1.580      |  1880     411        2.614       286         2.456
1831      50        1.699        37         1.568      |  1881     397        2.599       297         2.473
1832     [?]         [?]         36         1.556      |  1882     413        2.616       307         2.487
1833      46        1.663        40         1.602      |  1883     427        2.630       305         2.484
1834      49        1.690        42         1.623      |  1884     390        2.591       296         2.471
1835      49        1.690        47         1.672      |  1885     371        2.569       271         2.432
1836      57        1.756        53         1.724      |  1886     350        2.544       269         2.429
1837      55        1.740        42         1.623      |  1887     362        2.558       281         2.448
1838      61        1.785        50         1.699      |  1888     388        2.586       299         2.476
1839      62        1.792        53         1.724      |  1889     428        2.631       316         2.500
1840      67        1.825        51         1.708      |  1890     421        2.624       328         2.516
1841      63        1.799        52         1.716      |  1891     435        2.638       309         2.490
1842      64        1.806        47         1.672      |  1892     424        2.627       292         2.465
1843      68        1.832        52         1.716      |  1893     405        2.607       277         2.442
1844      74        1.869        59         1.771      |  1894     401        2.603       274         2.439
1845      83         [?]         60         1.778      |  1895     417        2.620       286         2.456
1846      73        1.863        58         1.763      |  1896     442        2.645       296         2.471
1847      83        1.919        59         1.771      |  1897     451        2.654       294         2.468
1848      89        1.949        53         1.724      |  1898     471        2.673       294         2.468
1849     100        2.000        64         1.806      |  1899     485        2.685       320         2.505

* Imports: official values till 1853; real values from 1854.
† Including re-exports.



[Side-note: Comparisons on the logarithmic ...]
The logarithmic scale has special uses in the comparison of series
of figures, and the methods discussed in the section devoted to that
subject can be readily adapted. The difficulty of the choice of
units in comparing quantities of different natures disappears when
we deal only with ratios; we need no longer trouble about the method
of percentages. In investigating causal relations we are more likely
to find close connection in ratios than in quantities, for if one
set of phenomena are connected with another, it is more likely that
the relation will be a proportional one (e.g., that an increase of
10 per cent. in some measurable characteristic of the one
corresponds to an increase of 8 per cent. in a characteristic of the
other), than an absolute quantitative one (e.g., that an increase of
2s. in a price, at whatever point it stands, corresponds to a
decrease of 100 in the number of purchasers). Resemblance between
two curves on the logarithmic scale will mean the correspondence in
proportional change, while resemblance on the natural scale means
correspondence in absolute change.

There is less trouble in this new method in equating averages than
before. For if the logarithms of two series are taken, it is quite
immaterial at what height on a logarithmic scale the two are plotted
out; alteration of height only means multiplication of all the items
by a constant quantity, and does not alter the appearance or
proportion of their fluctuations. The method to be employed is as
follows: Draw the curves representing two series of figures on a
logarithmic scale; then shift the lower curve vertically upwards to
and over the other, till the closest possible correspondence is
obtained; draw it in in this position, and the two series can be
accurately compared.
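
Where the shifting is to be done by calculation rather than by eye, one possible rule, which is my own assumption and not the author's, is to take as the shift the average difference between the two sets of logarithms, this being the shift that makes the sum of squared remaining differences least. The sketch below uses hypothetical function names and applies the rule to the imports and exports of 1895-99 as read from the table of logarithms.

```python
from math import log10


def best_vertical_shift(upper, lower):
    """Mean difference of logarithms: the least-squares vertical shift."""
    diffs = [log10(u) - log10(l) for u, l in zip(upper, lower)]
    return sum(diffs) / len(diffs)


def residuals_after_shift(upper, lower):
    """Remaining gaps, in logarithmic units, once `lower` has been raised."""
    shift = best_vertical_shift(upper, lower)
    return [log10(u) - (log10(l) + shift) for u, l in zip(upper, lower)]


imports = [417, 442, 451, 471, 485]  # 1895-99
exports = [286, 296, 294, 294, 320]  # 1895-99

shift = best_vertical_shift(imports, exports)
multiplier = 10 ** shift  # the constant by which exports are in effect multiplied
gaps = residuals_after_shift(imports, exports)
```

If one series were an exact multiple of the other, every remaining gap would vanish; the size of the gaps therefore measures how far the proportional fluctuations of the two series differ.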

The following example employs this method with a further ^ 
— development, corresponding to that of p. 177, supra^ where 
Bquattonof fluctuations are equated. In the earlier method 
flnotnauont. ^fJ^ used the average as a position from which to 

measure the various items, and adapted the scales ; a similar-- 
^ method might again be used, but it is more convenient to keep 

to one logarithmic scale, and now we have no base line to- 
^ consider. Calculate the fluctuations much as before, but express 

them as percentages of the adjacent maxima before taking their
average. In the following example it is found that a fluctuation
of 8.4 per cent. in the number employed, in those trade unions
whose returns are accessible,* corresponds to one of 9.7 per cent.
on the marriage rate. To investigate a possibly closer correspondence,
assume that a portion of the number employed do not influence the
marriage rate, and find what part must be subtracted before this
8.4 per cent. of the total forms as much as 9.7 per cent. of the
remainder; the average percentage of members of the trade unions at
work in the selected period was 95.1; 8.4 per cent. of this is 7.99,
which forms 9.7 per cent. of 82.4. Thus 12.7, the difference between
95.1 and 82.4, may be considered as not influencing the question, and
subtracted throughout before logarithms are taken. This process would
be replaced on the natural scale by equating the averages of two
series, and drawing one base line so far below the other that average
fluctuations would be represented by the same vertical distance for
both series; which process is exactly equivalent to that adopted on
p. 177. Expressed algebraically, we are now investigating the equation

\[ \log (y - c) - \log x = k, \text{ a constant}, \]

where $c$ and $k$ are constants to be so selected as to give the
closest fit, and $y$ and $x$ are the quantities to be compared.

In the following diagrams, figure 1 gives the figures in the natural
scale; figure 2 gives them on the logarithmic scale, after they have
been arranged so as to make average percentage fluctuations equal;
while in figure 3 the shorter period, 1880-96, is treated in a method
precisely similar to that of figure 2.

[Diagrams: MARRIAGE RATE AND EMPLOYMENT. Fig. 1. Comparison in 1865-93,
on natural scale. Fig. 2. The same; logarithmic scale. Fig. 3.
Comparison in 1880-1896. Horizontal axis: years. The curves plotted
include the marriage rate.]

* The figures in columns 2 and 4 in the second table on the next page are
taken from Mr G. H. Wood's paper on Some Statistics of Working Class
Progress since 1860, Statistical Journal, 1900, where a valuable logarithmic
diagram will be found, illustrating many of the points of this section.
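
The arithmetic by which the constant 12.7 is reached can be sketched as follows. This is only an illustrative check using the three figures quoted in the text (95.1, 8.4 and 9.7); the variable names are assumptions of this sketch, not the author's notation.

```python
# Sketch: find the part c of the percentage employed that must be subtracted
# so that its average fluctuation matches that of the marriage rate.
mean_employed = 95.1    # average percentage employed, 1865-93 (from the text)
fluct_employed = 8.4    # average fluctuation of employment, per cent. of the total
fluct_marriage = 9.7    # average fluctuation of the marriage rate, per cent.

# The fluctuation of employment in percentage points of the total.
absolute_fluct = mean_employed * fluct_employed / 100

# The remainder of which that same fluctuation forms 9.7 per cent.
remainder = absolute_fluct / (fluct_marriage / 100)

# The part treated as not influencing the marriage rate.
c = mean_employed - remainder

print(round(absolute_fluct, 2), round(remainder, 1), round(c, 1))
```

The three printed values correspond to the 7.99, 82.4 and 12.7 of the text.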



Average percentage employed, 1865-93, 95.1; 8.4 per cent. of 95.1 is
9.7 per cent. of 82.4.

MARRIAGE RATE PER 1,000.

Years.     Maxima.   Minima.   Differences.   Differences as per cent.
                                              of adjacent maxima.
1869        ...       15.9
                                  1.7              10
1873        17.6      ...
                                  3.2              18
1879        ...       14.4
                                  1.1               7
1882-83     15.5      ...
                                  1.3               8
1886        ...       14.2
                                  1.4               9
1891        15.6      ...
                                   .9               6
1893        ...       14.7
Average                                             9.7

PERCENTAGE EMPLOYED.

Years.     Maxima.   Minima.   Differences.   Differences as per cent.
                                              of adjacent maxima.
1868        ...       91.5
                                  7.4               7.5
1872        98.9      ...
                                 11.4              11.5
1879        ...       87.5
                                 10.6              10.8
1882        98.1      ...
                                  7.6               7.8
1886        ...       90.5
                                  7.4               7.6
1889-90     97.9      ...
                                  5.4               5.5
1893        ...       92.5
Average                                             8.4
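
The averages 9.7 and 8.4 in the table above can be recomputed from the turning points alone. The following is a minimal sketch; the function name `mean_fluctuation` is an assumption of this illustration, and each difference is expressed as a percentage of the adjacent maximum, as the text describes.

```python
def mean_fluctuation(turning_points):
    """Average of successive maximum-minimum differences,
    each taken as a percentage of the adjacent maximum."""
    pairs = zip(turning_points, turning_points[1:])
    pcts = [abs(a - b) / max(a, b) * 100 for a, b in pairs]
    return sum(pcts) / len(pcts)

# Alternate minima and maxima, in date order, from the table.
marriage = [15.9, 17.6, 14.4, 15.5, 14.2, 15.6, 14.7]
employed = [91.5, 98.9, 87.5, 98.1, 90.5, 97.9, 92.5]

marriage_avg = mean_fluctuation(marriage)
employed_avg = mean_fluctuation(employed)
print(round(marriage_avg, 1), round(employed_avg, 1))
```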



Years.   Marriage   Logarithms.   Percentage   Less 12.7.   Logarithms.
         Rate.                    Employed.
1865      17.5       1.243         98.0          85.3         1.931
1866      17.5       1.243         96.9          84.1         1.925
1867      16.5       1.217         92.7          80.0         1.903
1868      16.1       1.207         91.5          78.8         1.896
1869      15.9       1.201         92.6          79.9         1.902
1870      16.1       1.207         95.7          83.0         1.919
1871      16.7       1.223         98.2          85.5         1.932
1872      17.4       1.240         98.9          86.2         1.935
1873      17.6       1.245         98.7          86.0         1.934
1874      17.0       1.230         98.2          85.5         1.932
1875      16.7       1.223         97.5          84.8         1.928
1876      16.5       1.217         96.4          83.7         1.923
1877      15.7       1.196         95.6          82.9         1.919
1878      15.2       1.182         93.7          81.0         1.908
1879      14.4       1.158         87.5          74.8         1.874
1880      14.9       1.173         94.1          81.4         1.911
1881      15.1       1.179         96.5          83.8         1.923
1882      15.5       1.190         98.1          85.4         1.931
1883      15.5       1.190         97.8          85.1         1.930
1884      15.1       1.179         92.6          79.9         1.902
1885      14.5       1.161         91.0          78.3         1.894
1886      14.2       1.152         90.45         77.7         1.890
1887      14.4       1.158         92.6          79.9         1.902
1888      14.4       1.158         95.2          82.5         1.916
1889      15.0       1.176         97.9          85.2         1.930
1890      15.5       1.190         97.9          85.2         1.930
1891      15.6       1.193         96.5          83.8         1.923
1892      15.4       1.187         93.7          81.0         1.908
1893      14.7       1.167         92.5          79.8         1.902
Average              1.196                                    1.916
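
The equation log (y - c) - log x = k can be examined directly from the table. The sketch below recomputes the logarithms with c = 12.7 and prints the yearly values of k; the list names and the use of the mean as a summary are assumptions of this illustration, not the author's procedure.

```python
import math

# Marriage rate per 1,000 (x) and percentage employed (y), 1865-93, from the table.
marriage = [17.5, 17.5, 16.5, 16.1, 15.9, 16.1, 16.7, 17.4, 17.6, 17.0,
            16.7, 16.5, 15.7, 15.2, 14.4, 14.9, 15.1, 15.5, 15.5, 15.1,
            14.5, 14.2, 14.4, 14.4, 15.0, 15.5, 15.6, 15.4, 14.7]
employed = [98.0, 96.9, 92.7, 91.5, 92.6, 95.7, 98.2, 98.9, 98.7, 98.2,
            97.5, 96.4, 95.6, 93.7, 87.5, 94.1, 96.5, 98.1, 97.8, 92.6,
            91.0, 90.45, 92.6, 95.2, 97.9, 97.9, 96.5, 93.7, 92.5]
c = 12.7  # the part of employment taken as not influencing the marriage rate

log_x = [math.log10(x) for x in marriage]
log_y = [math.log10(y - c) for y in employed]
k = [ly - lx for lx, ly in zip(log_x, log_y)]

mean_log_x = sum(log_x) / len(log_x)
mean_log_y = sum(log_y) / len(log_y)
mean_k = sum(k) / len(k)

for year, value in zip(range(1865, 1894), k):
    print(year, round(value, 3))
print("averages:", round(mean_log_x, 3), round(mean_log_y, 3), round(mean_k, 3))
```

If the fit were perfect every yearly k would be the same; the printed column shows how far the two series depart from that.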
CHAPTER VIII.
ACCURACY.

INTRODUCTORY.

[The nature of measurement.] There is not in existence a perfectly
accurate measurement, physical or economical, just as there is no
perfectly straight line or perfect fluid. We can best illustrate the
nature of economic measurements by considering that of physical. It is
easy to weigh substances accurately to 1 gram: then by obtaining a good
balance, we can, as our apparatus is improved, weigh accurately to a
centigram, milligram, and one-tenth of a milligram; but for accuracy
beyond this the balance fails us. Similarly in measuring angles, the
naked eye can distinguish an object which subtends one-thirtieth of a
degree; with a sextant a measurement can be taken correctly to fifteen
seconds of arc; the Greenwich astronomers can make observations correct
to one-hundredth part of a second, but we again come to a point beyond
which precision is unattainable.

In such cases the result is stated as correct to a milligram,
or whatever it may be; in the same way we speak of an estimated
sum of money correct to a pound.

[Physical and statistical measurements.] A task which has considerable
resemblance to some statistical estimates, is the measurement of the
parallax of the sun, which determines its distance from the earth.
During the eighteenth century astronomers estimated it as 10",
equivalent to 96,000,000 miles. As methods of observation and
instruments were improved, observers began to agree that the whole
number of seconds was 8, but gave various estimates for the first
decimal figure. Since 1865 there have been very few estimates which
have not given 8 as the nearest figure for this place (8.8"), while
more recent observations agree in making the parallax from 8.76" to
8.78". We may, therefore, consider that the distance is now accurately
known to within 1 in 400. Notice in this connection, first, that the
earlier observations have been subject to corrections; secondly, that
better agreement has been attained as time has gone on; thirdly, that
neither absolute agreement nor absolute accuracy have yet been
obtained. So it is with statistical measurements; we might instance the
gradual settlement of the curve representing expectation of life, the
measurement of the fall in prices, and the development of wage
statistics.

[Degrees of possible accuracy.] Again in physical measurements, though
we can sometimes reach a very high degree of accuracy, as, for
instance, in the weight of a cubic foot of water which could doubtless
be known correctly to one part in a million, in other cases we are glad
if we can measure to one part in ten, as, for instance, in the distance
of the nearest fixed star from us, which is, roughly, from 34 to 37
billion miles. So in statistics it is something if we know that the
total capital of the United Kingdom was between 7½ and 10 thousand
million pounds in 1885, or if we know that the average weekly wage of
working-men in full work was from 21s. to 27s. in 1886. The weak point
in such statements is that often when we have made an estimate, which
we know to be inexact, we are not able to give any estimate of the
limits of the error. We are not so definite as The Modern Traveller who

"... knew the weather to a T,
The longitude to a degree,
The latitude exactly."

We are not able to say "our estimate is 24s. 5d., we are not certain to
1d., but it is not possible that we are as much as 1s. wrong"; whereas
in physical measurements we can often give the result correct to the
smallest graduation of the instrument employed.

[The accuracy generally needed.] On the other hand, though we cannot
obtain exactness, we can in many cases estimate to that degree of
accuracy which is required for practical purposes. In common use only
a certain conventional accuracy is needed. Thus, to take some
miscellaneous instances, the area of an estate is given in acres,
roods, and poles, but not correct to square
yards ; the market prices of shares do not change less than 
yV; we keep the day, not the hour, of our birth; railway 
time-tables do not show seconds ; ocean steamers are timed to 
start at certain hours, not minutes ; height is measured correct 
to one-tenth of an inch; a hundred yards race is timed to one-tenth
of a second. Similarly in statistical estimates, we seldom need that
our results shall be accurate within one per thousand, or even 1 per
cent. One per thousand of the working week is only three minutes;
1 per cent. of the week's wage is only 3d. We do not care to know the
population of London within 100, the expenditure of the Exchequer
within £1,000, or the expectation of life within a day. It is often
possible to attain practical accuracy within such limits.

Definition of Error. For purposes of measurement we may take the
following definition: The error in an estimate is the ratio of the
difference between the estimate and the true value, to the estimate;
the error is to be reckoned positive when the true value exceeds the
estimate.

Thus if the average weekly wage of agricultural labourers was in
reality 14s., and we estimated it as 13s., our error would be

\[ \frac{14 - 13}{13} = \frac{1}{13}, \text{ or 7.7 per cent.;} \]

if we had estimated it as 15s., the error would be

\[ \frac{14 - 15}{15} = -\frac{1}{15}, \text{ or } -6.6 \text{ per cent.} \]

In algebraic notation, if $u$ be the measurement of a quantity whose
true value is $u'$, then $\frac{u' - u}{u}$ is the error in the
estimate, which we shall call $e$; so that $e = \frac{u' - u}{u}$, and
$u' = u(1 + e)$.* $\frac{1}{e}$ is an appropriate measure of the
accuracy or precision of an estimate, becoming infinite when the error
is zero.
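
The definition can be put into a short sketch. The function names below are assumptions of this illustration; taking the absolute value of the error when forming the precision is also an assumption, made only so that the measure is never negative.

```python
def relative_error(estimate, true_value):
    """Error as defined in the text: (true - estimate) / estimate."""
    return (true_value - estimate) / estimate

def precision(error):
    """Reciprocal of the error; infinite when the error is zero."""
    return float("inf") if error == 0 else 1 / abs(error)

low = relative_error(13, 14)    # estimate too low: positive error, 1/13
high = relative_error(15, 14)   # estimate too high: negative error, -1/15

print(round(100 * low, 1), round(precision(low), 1))
print(round(100 * high, 2), round(precision(high), 1))
```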

[Statement of errors.] In the nature of things, when we are dealing
with errors, we do not know their magnitude; the most we can know is
their probable and possible extent. We might estimate, for instance,
the percentage of unemployed in a certain year as 4.5, and add, from
information in our possession (coming from a study of wage-bills or the
reports of relief agencies), that we considered this to be within .5 of
the fact; we should then write the number 4.5 ± .5, meaning that the
error in the estimate as defined above was unlikely to be more than
$\frac{.5}{4.5} = \frac{1}{9}$, or 11 per cent., and the

* This and most of the following algebraic paragraphs are from a paper
on the Relations between the Accuracy of an Average and that of its
Constituent Parts, by the present author, in the Statistical Journal of
December 1897.
precision was 9. In such a case we can also give definite limits.
The percentage unemployed must lie between 0 and 100; and if we could
actually enumerate 1 per cent. of the working-class as out of work, and
also 92 per cent. as in work, we should know that the number required
was between 1.0 and 8.0 per cent., and the maximum error in our
estimate, 4.5, was $\frac{3.5}{4.5} = \frac{7}{9}$, or 77 per cent.
Even this is more precise than the original statement, "the percentage
is 4.5, error unknown." By further investigation we might perhaps bring
the limits of error nearer to each other, and decide that it was
practically certain that the percentage required was between 4 and 5;
then we ought to say "the number unemployed is .04 . . . of the
working-class, the estimate being correct to the last figure given."
This statement is of the same nature as, "The body weighs 15 lbs. 3 oz.,
correct to an ounce."
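
The passage from definite limits to a maximum error can be sketched as below. The function name is an assumption of this illustration; the figures are those of the unemployment example.

```python
def max_error_from_limits(estimate, lower, upper):
    """Largest possible error (as a ratio to the estimate) when the
    true value is known to lie between `lower` and `upper`."""
    return max(upper - estimate, estimate - lower) / estimate

close_limits = max_error_from_limits(4.5, 4.0, 5.0)   # the statement 4.5 +/- .5
wide_limits = max_error_from_limits(4.5, 1.0, 8.0)    # 1 per cent. out, 92 per cent. in work

print(round(close_limits, 3), round(1 / close_limits, 1))
print(round(wide_limits, 3))
```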

While, on the one hand, it is clear that we cannot often obtain close
definite limits to our errors, on the other we can very often see that
some of the digits in a total are almost certainly right and others
almost certainly wrong. Thus when we see in the Registrar-General's
Report that the population of the United Kingdom in 1895 was
39,124,496, the estimate being made from the census of 1891, and the
increase calculated on the basis of the increase since 1881, we may be
certain that the last two, or the last three, digits are no better than
guess-work; while the first two, or the first three, are correct. Thus
the statement should read: Population was 39.1 millions, or
39,124,000 ± 5,000, or whatever figures our examination of the varying
rate of progress of the population led us to adopt, and this statement
is actually more correct than the previous one.

[Neglect of minutiae.] It is the custom in many classes of estimates to
give the figures to the uttermost farthing. This is possibly right in
official publications; for the business of the office is to receive and
tabulate returns, stating how and whence they came, and leaving to the
economist or the statistician the task of deciding the degree of
accuracy pertaining to them. But in summary descriptions and accounts,
and in scientific estimates, it is not merely unnecessary to give these
last figures (both because they are not accurately known, and because
they generally have no importance to the argument or significance to
the reader), but it is positively inaccurate. The easiest way to avoid
the inaccuracy is simply to state totals in so many thousands (e.g.,
the earth is 8,000 miles in diameter), or if for any reason more exact
measure be required (as when we are comparing the equatorial diameter
with the smaller one through the poles), the scientific way is to give
the number as far as it has been fairly calculated, and to indicate its
precision.



Rules for Computing the Effect of Errors. 

We may now give some rules connecting the errors of a 
complex estimate with those of the elements which form it. 

I. The error in an estimated sum is equal to the sum of the 
errors in the parts when each is multiplied by the ratio of the 
corresponding part to the sum. 

[Error in sum.] For if we estimate $n$ quantities as
$u_1, u_2 \ldots u_n$ and their sum as $u$, so that
$u = u_1 + u_2 + \ldots + u_n$, and the errors of the quantities are
$e_1, e_2 \ldots e_n$ and that of the sum is $e$; then the true value
of the sum is $u(1 + e)$, and the true values of the parts are
$u_1(1 + e_1), u_2(1 + e_2) \ldots$, so that

\[ u(1 + e) = u_1(1 + e_1) + u_2(1 + e_2) + \ldots, \]

but

\[ u = u_1 + u_2 + \ldots; \]

hence, by subtraction,

\[ ue = u_1 e_1 + u_2 e_2 + \ldots, \]

and

\[ e = e_1 \times \frac{u_1}{u} + e_2 \times \frac{u_2}{u} + \ldots \]

The formula is easily adapted to the case where some of the
parts are subtractive.

To take an arithmetical example, if two trade unions return
respectively 555 and 45 members as out of work, while the true
numbers are 565 and 50, so that the errors are $\frac{2}{111}$ and
$\frac{1}{9}$, then the error in the sum is by the above rule

\[ \frac{2}{111} \times \frac{555}{600} + \frac{1}{9} \times \frac{45}{600} = \frac{10}{600} + \frac{5}{600} = \frac{1}{40}, \text{ or 2.5 per cent.} \]

The greater error in the returns of the smaller union has little
effect on the total.

We can apply the rule to the important case where we 
can estimate a great part of a required total with considerable 
accuracy, while we are ignorant of a smaller part. Thus we may receive
returns from several unions that 33,650 are out of work, and have
reason to know that the error is not more than 1 per cent., while some
smaller unions do not send any returns; we make an estimate for the
smaller unions, say that 1,000 of their members are unemployed, and
suppose a very large error, say $\frac{2}{3}$ or 67 per cent. Then the
error in the total is less than

\[ \frac{1}{100} \text{ of } \frac{33650}{34650} + \frac{2}{3} \times \frac{1000}{34650} = 2.9 \text{ per cent.,} \]

an error very much nearer that of the larger returns than that of the
smaller. In the preceding sentence we say "less than," because we
assume that we have taken an outside limit for the smaller error.
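
Both arithmetical examples under Rule I. can be checked with a few lines. The function name `error_in_sum` is an assumption of this sketch; it simply weights each part's error by the share of that part in the estimated total.

```python
def error_in_sum(estimates, errors):
    """Rule I: error of a sum = sum of (error of part * part / total)."""
    total = sum(estimates)
    return sum(e * u / total for u, e in zip(estimates, errors))

# Two unions returning 555 and 45, the true numbers being 565 and 50.
two_unions = error_in_sum([555, 45], [10 / 555, 5 / 45])
direct = (565 + 50 - (555 + 45)) / (555 + 45)   # the same error found directly

# Large returns known within 1 per cent., small part guessed within 2/3.
outside_limit = error_in_sum([33650, 1000], [0.01, 2 / 3])

print(round(two_unions, 4), round(direct, 4), round(100 * outside_limit, 1))
```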

II. The error in the arithmetic average of several estimates is
the sum of the errors of these estimates, when each is multiplied by
the ratio of the corresponding estimate to that of the sum of the
estimates.

[Error in average.] For if $m_1, m_2, \ldots m_n$ are $n$ estimates of
quantities whose true values are $m_1(1 + e_1), m_2(1 + e_2), \ldots$,
the estimated and true averages are respectively

\[ \frac{m_1 + m_2 + \ldots + m_n}{n} \quad \text{and} \quad \frac{m_1(1 + e_1) + m_2(1 + e_2) + \ldots + m_n(1 + e_n)}{n}, \]

and the error in the average is

\[ \frac{\dfrac{m_1(1 + e_1) + m_2(1 + e_2) + \ldots}{n} - \dfrac{m_1 + m_2 + \ldots}{n}}{\dfrac{m_1 + m_2 + \ldots}{n}} = \frac{e_1 m_1 + e_2 m_2 + \ldots}{m_1 + m_2 + \ldots} = e_1 \times \frac{m_1}{\sum m} + e_2 \times \frac{m_2}{\sum m} + \ldots, \]

where $\sum m$ denotes the sum of all the $m$'s.

It is easily seen that no individual error can have much 
influence on the result, that the error in the average would be 
nearly of the same magnitude as one of the individual errors, if 
these were not very unequal, and all positive or all negative, and 
that if, as is generally the case, some are positive and some 
negative (a point we shall consider presently), the error would be 
considerably lessened. 
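
The remarks on Rule II. can be illustrated with invented figures. The four estimates and both sets of errors below are hypothetical, chosen only to show same-signed errors persisting and mixed-signed errors largely cancelling; `error_in_average` is an assumed name.

```python
def error_in_average(estimates, errors):
    """Rule II: error of an average = sum of (error * estimate / sum of estimates)."""
    total = sum(estimates)
    return sum(e * m / total for m, e in zip(estimates, errors))

estimates = [20, 22, 24, 26]               # hypothetical estimates
same_sign = [0.05, 0.04, 0.03, 0.05]       # hypothetical errors, all positive
mixed_sign = [0.05, -0.04, 0.03, -0.05]    # same magnitudes, signs mixed

all_positive = error_in_average(estimates, same_sign)
partly_cancelled = error_in_average(estimates, mixed_sign)
print(round(all_positive, 4), round(partly_cancelled, 4))
```

With errors all of one sign the result lies among the individual errors; with mixed signs it falls below the smallest of them.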
III. The error in a weighted average is the sum of (1) an error
due to errors in the quantities, similar to the error of an unweighted
average, and (2) an error due to errors in the weights, which becomes
very small when the original quantities are nearly equal.

[Error in weighted average.] For if $w_1, w_2 \ldots$ be estimated
weights applied to estimated quantities $m_1, m_2 \ldots$, and if the
true values of the weights are
$w_1(1 + \epsilon_1), w_2(1 + \epsilon_2) \ldots$, and of the
quantities $m_1(1 + e_1), m_2(1 + e_2) \ldots$, then the error is

\[ \left[ \frac{\sum \{ m(1 + e) \cdot w(1 + \epsilon) \}}{\sum \{ w(1 + \epsilon) \}} - \frac{\sum mw}{\sum w} \right] \div \frac{\sum mw}{\sum w}. \]

If we simplify this expression and neglect the products of two of the
errors $e$ and $\epsilon$ (for if $e$ and $\epsilon$ are each .1, their
product is only .01), we obtain:

Error in weighted average is

\[ \left[ e_1 \frac{m_1 w_1}{\sum mw} + e_2 \frac{m_2 w_2}{\sum mw} + \ldots \right] + \left[ (\epsilon_1 - \epsilon_2)(m_1 - m_2) \frac{w_1 w_2}{\sum w \cdot \sum mw} + \text{similar terms for each pair of quantities} \right]. \]

If $m_1 - m_2$ is small, that is, if two of the original quantities
are nearly equal, the first term in the second bracket becomes
very small. Very great errors are required in the weights to
make any appreciable error in the average. In fact, the errors
in the quantities have so much more influence than the weights on
the weighted average of not very unequal quantities, that errors in
the weights can generally be neglected. Many numerical examples
of this principle were given in the chapter on weighted averages.
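
A numerical sketch of Rule III. follows. All figures are hypothetical (nearly equal quantities, with weight errors of 10 to 30 per cent.), and the function names are assumptions; the weight term is computed in the equivalent form sum(eps*m*w)/sum(m*w) - sum(eps*w)/sum(w), which expands into the pairwise terms of the second bracket.

```python
def weighted_mean(ms, ws):
    return sum(m * w for m, w in zip(ms, ws)) / sum(ws)

def first_order_terms(ms, ws, es, eps):
    """Return (error due to quantities, error due to weights), to first order."""
    smw = sum(m * w for m, w in zip(ms, ws))
    sw = sum(ws)
    quantity_term = sum(e * m * w for m, w, e in zip(ms, ws, es)) / smw
    weight_term = (sum(x * m * w for m, w, x in zip(ms, ws, eps)) / smw
                   - sum(x * w for w, x in zip(ws, eps)) / sw)
    return quantity_term, weight_term

ms = [24, 26, 25]            # hypothetical quantities, nearly equal
ws = [100, 200, 300]         # hypothetical estimated weights
es = [0.02, -0.01, 0.03]     # hypothetical errors in the quantities
eps = [0.3, -0.2, 0.1]       # hypothetical (large) errors in the weights

quantity_term, weight_term = first_order_terms(ms, ws, es, eps)

# Exact error, computed from the "true" quantities and weights.
estimated = weighted_mean(ms, ws)
true = weighted_mean([m * (1 + e) for m, e in zip(ms, es)],
                     [w * (1 + x) for w, x in zip(ws, eps)])
exact = (true - estimated) / estimated

print(round(quantity_term, 4), round(weight_term, 4), round(exact, 4))
```

Even with weights wrong by as much as 30 per cent., the weight term here is under one-half of 1 per cent., smaller than the term arising from errors of 1 to 3 per cent. in the quantities.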

IV. The error in a product is approximately the sum of the 
errors in its factors, due regard being paid to sign. 

[Error in product.] For if $f_1, f_2 \ldots f_n$ are the estimated
factors, whose true values are $f_1(1 + e_1), f_2(1 + e_2), \ldots$,
then the error of the product

\[ = \frac{f_1(1 + e_1) \cdot f_2(1 + e_2) \ldots - f_1 f_2 \ldots}{f_1 f_2 \ldots} = (1 + e_1)(1 + e_2) \ldots - 1 = e_1 + e_2 + \ldots + e_n, \]

if we neglect products of two or more $e$'s.

The $e$'s are equally likely, à priori, to be positive or negative.
If two $e$'s are of different signs, they tend to neutralise one
another. The error in a product may be great if all the errors
of the factors are of the same sign, even if they are small
individually.

For example, if we estimate that 100 men are earning on the
average 25s. each, while in reality there are 105 men earning
26s., the error in the estimated total sum earned is, by formula,

\[ \frac{5}{100} + \frac{1}{25} = .09. \]

If, with the same estimates, the real quantities had been 105
and 24s., the error in the product would have been

\[ \frac{5}{100} - \frac{1}{25} = .01. \]
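
The wage-bill example can be checked against the exact product. The function names are assumptions of this sketch; the approximate rule adds the errors, the exact calculation multiplies the factors (1 + e).

```python
import math

def approx_product_error(errors):
    """Rule IV: sum of the errors of the factors."""
    return sum(errors)

def exact_product_error(errors):
    return math.prod(1 + e for e in errors) - 1

same_sign = [5 / 100, 1 / 25]       # 105 men at 26s. against 100 at 25s.
opposite = [5 / 100, -1 / 25]       # 105 men at 24s. against 100 at 25s.

print(round(approx_product_error(same_sign), 3), round(exact_product_error(same_sign), 3))
print(round(approx_product_error(opposite), 3), round(exact_product_error(opposite), 3))
```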

V. The error in a ratio is approximately the difference between 
the errors in its two terms, due regard being had to sign. 

[Error in ratio.] For if $u_1, u_2$ be the estimated terms, whose true
values are $u_1(1 + e_1)$ and $u_2(1 + e_2)$, then the error in the
ratio is

\[ \left[ \frac{u_1(1 + e_1)}{u_2(1 + e_2)} - \frac{u_1}{u_2} \right] \div \frac{u_1}{u_2} = \frac{1 + e_1}{1 + e_2} - 1 = (e_1 - e_2)(1 - e_2 + e_2^2 - \ldots) = e_1 - e_2, \]

if we neglect terms of the second order in the $e$'s.

If the errors in the terms are both positive or both negative,
they tend to neutralise one another; if they are also nearly equal,
the error in the ratio becomes very small.
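
Rule V. can be illustrated in the same way. The two errors of 5 and 4 per cent. below are hypothetical, chosen to show same-signed, nearly equal errors almost cancelling in a ratio; the function names are assumptions.

```python
def approx_ratio_error(e1, e2):
    """Rule V: difference between the errors of the two terms."""
    return e1 - e2

def exact_ratio_error(e1, e2):
    return (1 + e1) / (1 + e2) - 1

e1, e2 = 0.05, 0.04   # hypothetical errors, both positive and nearly equal
approx = approx_ratio_error(e1, e2)
exact = exact_ratio_error(e1, e2)
print(round(approx, 4), round(exact, 4))
```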

We can apply Rule V. to the error in comparison of two 
averages of similar quantities estimated at different dates. 

With the same notation as under Rules II. and III., using $m$, $w$,
$e$, $\epsilon$ for the letters at one date, and $m'$, $w'$, $e'$,
$\epsilon'$ for similar quantities at another date, then the error in
the ratio of the simple average of $m'_1, m'_2 \ldots$ to the simple
average of $m_1, m_2, \ldots$ is

\[ \sum \left\{ e'_1 \frac{m'_1}{\sum m'} - e_1 \frac{m_1}{\sum m} \right\}. \]

Now if the quantities have not changed much during the period between
two observations, the fraction $\frac{m_1}{\sum m}$ will differ little
from $\frac{m'_1}{\sum m'}$, and so on.

Neglecting these differences in comparison with the quantities
themselves, a legitimate process when we are estimating the approximate
influence of errors, we have

\[ \text{Error in the ratio of the simple averages} = \sum \left\{ \frac{m'_1}{\sum m'} (e'_1 - e_1) \right\}. \]

If the two estimates have been made under nearly similar circumstances,
leading to similar chances of errors, $e_1$ and $e'_1$ are likely to be
not only of the same sign, but nearly equal.

Write $d_1, d_2 \ldots$ for $(e'_1 - e_1), (e'_2 - e_2) \ldots$, and we
have

\[ \text{Error} = \sum \left\{ d_1 \cdot \frac{m'_1}{\sum m'} \right\}, \text{ where the } d\text{'s may be small.} \]

The corresponding analysis for the error in the ratio of two 
weighted averages is too complicated to be given here ; * but 
using the principle that errors in weight are less important than 
errors in quantity, which applies with slight modifications, we 
may use the formula just given for the first approximation to 
the error in the ratio of two weighted averages. This formula 
may be put in words : — 

VI. The error in the ratio of two averages of similar series
of quantities, estimated at different dates, is approximately equal
to the sum of the differences between the errors in the corresponding
terms of the two series, each multiplied by the ratio of
the latter of these corresponding terms to the sum of all the terms
at the latter date.
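
Rule VI. can be tried on invented series. The two series and their errors below are hypothetical (slowly changing quantities, errors of similar size at both dates), and the function names are assumptions of this sketch.

```python
def rule_vi_error(later, err_earlier, err_later):
    """Sum of (difference of errors) * (later term / sum of later terms)."""
    total = sum(later)
    return sum((e2 - e1) * m / total
               for m, e1, e2 in zip(later, err_earlier, err_later))

def exact_error(earlier, later, err_earlier, err_later):
    """Error in the ratio of the two simple averages, computed directly."""
    estimated = sum(later) / sum(earlier)   # equal lengths, so n cancels
    true = (sum(m * (1 + e) for m, e in zip(later, err_later))
            / sum(m * (1 + e) for m, e in zip(earlier, err_earlier)))
    return (true - estimated) / estimated

earlier = [20, 30, 50]              # hypothetical estimates at the first date
later = [22, 33, 54]                # hypothetical estimates at the second date
err_earlier = [0.05, 0.04, 0.06]    # hypothetical errors at the first date
err_later = [0.06, 0.03, 0.07]      # hypothetical errors at the second date

approx = rule_vi_error(later, err_earlier, err_later)
exact = exact_error(earlier, later, err_earlier, err_later)
print(round(approx, 4), round(exact, 4))
```

Although every single estimate is 3 to 7 per cent. in error, the ratio of the averages is wrong by less than one-half of 1 per cent.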

[Error in comparison of averages.] This rule is so important that it
will be worth while to illustrate it by an example, in which a further
quantity will be introduced.

If in each of two years we are able to estimate, as in our example
under Rule I., one part of a total more accurately than another part, we
can use the following formulae:

                                     First Year.                    Second Year.
Estimated numbers or weights         $w$; error $\epsilon$          $w'$; error $\epsilon'$
Estimated average income, or
  quantity                           $m_1$; error $e_1$             $m'_1$; error $e'_1$
Estimated number, less accurately
  known                              $rw$; error in $r$, $\rho$     $r'w'$; error in $r'$, $\rho'$
Estimated income                     $m_2$; error $e_2$             $m'_2$; error $e'_2$

$e_1$ and $e'_1$ are, by hypothesis, less than $e_2$ and $e'_2$.

Error in average for first year

\[ = \left[ \frac{w(1 + \epsilon) \cdot m_1(1 + e_1) + r(1 + \rho) \cdot w(1 + \epsilon) \cdot m_2(1 + e_2)}{w(1 + \epsilon) + r(1 + \rho) w(1 + \epsilon)} - \frac{w m_1 + r w m_2}{w + rw} \right] \div \frac{w m_1 + r w m_2}{w + rw} \]

\[ = e_1 \frac{m_1}{m_1 + r m_2} + e_2 \frac{r m_2}{m_1 + r m_2} + \rho \cdot \frac{r}{1 + r} \cdot \frac{m_2 - m_1}{m_1 + r m_2}, \]

if we neglect products of $e$ and $\rho$.
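
The three-term formula for the first-year error can be compared with a direct calculation. The estimated figures (36, 34, one-fifth) and the true numbers (1,010 and 220) follow the ploughmen example; the true income of the second class, taken as 37 here, is an assumed value for illustration only, and the variable names are assumptions of this sketch.

```python
# Estimated values for the first year.
m1, m2, r = 36.0, 34.0, 0.2

# Supposed true values (true_m2 = 37 is an assumption for illustration).
true_m1, true_m2 = 35.0, 37.0
true_r = 220 / 1010

e1 = (true_m1 - m1) / m1        # error in the better-known income
e2 = (true_m2 - m2) / m2        # error in the less accurately known income
rho = (true_r - r) / r          # error in the ratio of numbers

denominator = m1 + r * m2
approx = (e1 * m1 / denominator
          + e2 * r * m2 / denominator
          + rho * (r / (1 + r)) * (m2 - m1) / denominator)

# Direct calculation from estimated and true averages.
estimated_avg = (m1 + r * m2) / (1 + r)
true_avg = (true_m1 + true_r * true_m2) / (1 + true_r)
exact = (true_avg - estimated_avg) / estimated_avg

print(round(approx, 4), round(exact, 4))
```

The two results differ only in the third decimal place, the difference arising from the neglected products of errors.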

* It will be found in the article cited on p. 201 above ; a further approxi- 
mation also for the error in the ratio of simple averages is there given. 
Here the errors, $e_2$ and $\rho$, connected with the less accurately
known part, are each multiplied by $r$, the ratio of the weight of that
part to the weight of the better known part; while $e_1$, the remaining
error, is by hypothesis small.

If for simplicity of argument we assume that the ratio of the unknown
part to the whole (but not the error in estimating it) has remained
unchanged, and also that the ratio of the estimated average incomes of
the two parts has not altered, we have for the error in comparison:

Thus in estimating the change in the average wages of Scotch
agricultural labourers, we have figures similar in character to the
following:

                                      1867.            [Later year; date illegible.]
MARRIED PLOUGHMEN.
Estimated number                      1,000            1,200
  Average income                      £36              £49 0 0
Supposed true number                  1,010            1,220
  Average income                      £35              £48 0 0

FARM-SERVANTS.
Estimated number                      200              240
  Average income: Money               £21              £27 5 0
  Estimated value of board            £13              £14 0 0
  Total                               £34              £41 5 0
Supposed true number                  220              240
  Total income                        [illegible]      [illegible]

Here $w = 1{,}000$, $m_1 = 36$, $r = \tfrac{1}{5}$, $m_2 = 34$,
$w' = 1{,}200$, $m'_1 = 49$, $r' = \tfrac{1}{5}$.

Here it is supposed that we have overvalued the income of 
the married ploughmen, and undervalued that of the farm- 
servants in both cases. We suppose, as is the fact, that the 
value of the board and other perquisites of the farm-servants 
cannot be estimated with precision, and that the proportionate 
numbers in the two classes are not accurately known. 

Substituting in the above formula we find that the error in 
the estimated ratio of the average incomes of the two classes 
together in the two years is — 

- .006, due to errors in estimates of income of ploughmen.
+ .008, due to errors in estimates of income of servants.
- .001, due to errors in ratios of the numbers in the two classes.

Thus the last error, due to weights, is very small, and the
second error, due to ignorance of the value of board, is reduced
by the smallness of the number employed to a magnitude comparable
with the first.

The whole error is, therefore, by formula +.001. Going to
the actual figures, we find the estimated ratio of the second to
the first to be 1.338 to 1, and the supposed true ratio to be
1.335 to 1; that is, the error is $\frac{.003}{1.338} = .002 \ldots$

The difference between the two methods of calculation is
then 1 in the third decimal place, which is accounted for by
the neglect of the less important terms.

It is to be noticed that the error in the ratio of two quantities 
is not the same as the error which we might be inclined to 
estimate, the error in the percentage increase. Thus in the case 
just taken, the estimated and true percentage increases are 33.8 
and 33.5, and the error in the percentage increase is .01. For 
accuracy in such calculations, then, we require the error found 
by formula, according to Rule VI., to be very small. 
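
The distinction between the error in a ratio and the error in the percentage increase can be seen from the two ratios quoted, 1.338 and 1.335. The sketch below applies the definition of error given earlier in the chapter to each; the variable names are assumptions.

```python
estimated_ratio, true_ratio = 1.338, 1.335

# Error in the ratio itself.
ratio_error = (true_ratio - estimated_ratio) / estimated_ratio

# Error in the percentage increase (33.8 against 33.5).
estimated_increase = (estimated_ratio - 1) * 100
true_increase = (true_ratio - 1) * 100
increase_error = (true_increase - estimated_increase) / estimated_increase

print(round(abs(ratio_error), 3), round(abs(increase_error), 2))
```

The same discrepancy of .003 is about four times as serious when it is referred to the increase of 33.8 per cent. as when it is referred to the ratio 1.338.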



Biassed and Unbiassed Errors. 

[Errors are biassed or unbiassed.] In the consideration of all errors
in averaging or comparing, it is important to distinguish two classes
of errors, those which are biassed and those which are unbiassed. The
difference can be made clear by illustrations. If a number of men are
sent to investigate the condition of an industry in different places,
with a view of proving that wages are high, conditions of work healthy,
and so on, they would probably, by examining only the best conducted
works, and taking the wages only of the more skilled and regular
workmen, produce an average for each town which would be too high.
On the other hand, if there was no brief to be held, but the
investigation was impartial, the commissioners would in some towns take
too high an average, in others too low, according to their
idiosyncrasies and to circumstances. In the first case, the errors
would be biassed, all in the same direction, all tending to increase
the average, whose errors would be equal to the average error in the
different towns. In the second case, the errors would be unbiassed,
just as likely to be in excess or defect, and the more estimates made,
the smaller would the resulting error be. The following figures would
illustrate this:





Average Wages in District.    Fact.    Biassed      Unbiassed
                                       Estimate.    Estimate.
                              s.       s.           s.
a                             24       25           24
b                             23       25           25
c                             26       27           25
d                             27       28           28
e                             28       30           27
Averages                      25.6     27.0         25.8
Errors                                 5.2%         1%
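
The averages and errors of the table can be recomputed as follows. The errors are taken according to the definition given earlier in the chapter (difference between true value and estimate, as a ratio to the estimate); the variable names are assumptions of this sketch.

```python
fact = [24, 23, 26, 27, 28]        # true average wages in districts a to e
biassed = [25, 25, 27, 28, 30]     # every estimate too high
unbiassed = [24, 25, 25, 28, 27]   # some too high, some too low

def mean(values):
    return sum(values) / len(values)

def error(estimate, true_value):
    return (true_value - estimate) / estimate

biassed_error = error(mean(biassed), mean(fact))
unbiassed_error = error(mean(unbiassed), mean(fact))

print(mean(fact), mean(biassed), mean(unbiassed))
print(round(100 * biassed_error, 1), round(100 * unbiassed_error, 1))
```

Both errors come out negative, since in each case the estimate exceeds the fact; the biassed error is several times the unbiassed.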



In measuring the distance of a bicycle ride on a mile-stoned 
road, it is found that the distances between successive milestones 
are not exact, but perhaps 100 to 200 yards out ; but it is nearly 
as likely that the errors will be in excess or defect, and the greater 
the distance gone the smaller will be the error, as defined. The 
errors are unbiassed. If, on the other hand, the bicyclist trusts 
to his cyclometer, he will have to deal with a biassed error, for the 
instrument will not fit the wheel exactly, but will always register 
say 1,900 yards when the machine has gone a mile. This is a
case where the bias can be measured and allowed for, whereas 
the unbiassed errors must be left to eliminate themselves. It is 
frequently the case that biassed errors are due to a wrongly gradu- 
ated instrument ; unbiassed to separate faulty measurements. 

In the census returns, the fact that many women return themselves as
younger than their birth certificate states, causes a biassed error in
the average age of the population; the fact that people frequently
return their ages at the nearest round number causes unbiassed error,
and on the whole does not affect the average. It is not improbable that
in the Wage Census of 1886-89, there was a general tendency to obtain
returns from the more liberally and better conducted establishments;
this causes a biassed error in the average obtained. With these
illustrations we can pass on to another principle of great importance.
[Relative importance of biassed and unbiassed errors.] Unbiassed errors
are of little importance compared with biassed errors in a simple
estimate; but biassed errors diminish when the ratio of two similar
estimates is taken.
For in an average of several quantities, which have biassed errors
($\eta_1, \eta_2 \ldots$), and unbiassed errors ($e_1, e_2 \ldots$), it
is easy to see from Rule II. that the resulting error may be written

\[ \sum \left( e_1 \cdot \frac{m_1}{\sum m} \right) + \sum \left( \eta_1 \cdot \frac{m_1}{\sum m} \right). \]

In the first term, the errors being unbiassed, many of them are
positive, many of them negative, and they tend to neutralise one
another; in fact, if $E$ is typical of the errors $e_1, e_2 \ldots$,
then a first approximation to the error arising from them in the
average is $\frac{E}{\sqrt{n}}$.*

Thus in the average of one hundred measurements, whose individual
unbiassed errors are about $\frac{1}{10}$, the resulting error is
$\frac{1}{10} \div \sqrt{100} = \frac{1}{100}$. There is no
counterbalancing tendency, on the other hand, in the biassed errors; if
each estimate was 10 per cent. in excess, then the average is also
10 per cent. in excess.
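
The approximation E divided by the square root of n can be tried by simulation. This is only a sketch under stated assumptions: the unbiassed errors are drawn from a normal distribution with standard deviation E (the text does not specify a distribution), the quantities averaged are taken as equal, and the function name and the number of trials are choices of this illustration.

```python
import math
import random

def rms_error_of_average(n, typical_error, trials=2000, seed=0):
    """Root-mean-square error of an average of n equal quantities,
    each carrying an independent unbiassed error of typical size E."""
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(trials):
        mean_err = sum(rng.gauss(0.0, typical_error) for _ in range(n)) / n
        total_sq += mean_err ** 2
    return math.sqrt(total_sq / trials)

n, E = 100, 0.1
simulated = rms_error_of_average(n, E)
predicted = E / math.sqrt(n)

# A biassed error of the same size in every estimate does not diminish.
biassed = sum([E] * n) / n

print(round(predicted, 4), round(biassed, 4))
print("simulated:", simulated)
```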
[Great effect of biassed errors.] When aiming at accuracy our principle
always is to take care of the pounds, and let the pence take care of
themselves; and it is quite futile to diminish the unbiassed errors,
that is to increase the precision of our measurements, while a large
biassed error runs through them all. If we do not know of the existence
of biassed errors, which in reality pervade our estimates, there is no
remedy; if we do know of them, we are likely to obtain more accuracy by
the most erroneous corrections for them than by neglecting them; for
when we make unbiassed corrections for our biassed errors, we reduce
them to unbiassed errors, and then the more terms we include in our
average the smaller is our resulting error. If, for instance, we find
that the average weekly wage of agricultural labourers throughout the
country is 13s., and by considering the circumstances of the thousand
returns which we may suppose led to this average we have reason to
suppose that an error of 1s. would be typical of the unbiassed errors
in them, then an error of $\frac{1s.}{\sqrt{1000}}$,† that is only
$\tfrac{2}{5}$d., may be expected to result in the average. We have
here a totally illusive accuracy;



* See article cited p. 201, supra, and Part II., Sect. V., infra.
† More correctly the error in the average is as likely as not to be as great
as this, and very unlikely to be much greater.
the part of the labourer's income which we have not included,
payments at haytime and harvest, facilities for piece-work, cheap
rent for cottage and land and smaller perquisites, is not capable
of exact calculation. If we omit all these entirely we shall leave
an error in our average of 2s. or so; but if we make individual
estimates of these additions, in all the thousand cases, though
each estimate may be 2s. wrong, if there is no bias, the resulting
error on the average may be expected to be $\frac{2s.}{\sqrt{1000}}$,
that is only $\tfrac{3}{4}$d.: our whole error is now not far from 1d.,
instead of 2s. In estimating the accuracy of published averages, these
principles should be always borne in mind, and the possibility of
biassed errors always considered.
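
The figures of two-fifths and three-quarters of a penny can be checked by working in pence. Combining the two resulting errors as the root of the sum of their squares is an assumption of this sketch (the text says only that the whole error is not far from 1d.); the variable names are likewise assumptions.

```python
import math

returns = 1000
pence_per_shilling = 12

# Unbiassed error of about 1s. in each return of weekly wages.
wage_error = 1 * pence_per_shilling / math.sqrt(returns)

# Unbiassed error of about 2s. in each individual estimate of the extras.
extras_error = 2 * pence_per_shilling / math.sqrt(returns)

# Assumed combination of the two independent errors.
whole_error = math.hypot(wage_error, extras_error)

# Omitting the extras altogether leaves a biassed error of about 2s.
omitted_error = 2 * pence_per_shilling

print(round(wage_error, 2), round(extras_error, 2),
      round(whole_error, 2), omitted_error)
```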

[Accuracy of comparisons.] When we are dealing with the errors of a
ratio the case is quite different. The error of a ratio is
approximately equal to the difference between the errors in its terms;
if $\eta$, $\eta'$ and $e$, $e'$ are the biassed and unbiassed errors
in the terms, then by Rules I. and V. $(\eta' - \eta) + (e' - e)$ is the
error in the ratio. Now the unbiassed error $(e' - e)$ is likely to
be of nearly the same magnitude as either $e$ or $e'$;* if, as in the
above example, $e$ and $e'$ are unlikely to be much greater than
$\tfrac{2}{5}$, $(e' - e)$ would be unlikely to be much greater than
$\tfrac{3}{5}$. But

$(\eta' - \eta)$, the result of the biassed errors, will, if the bias
in both terms of the ratio was in the same sense (positive in both, or
negative in both), be less than the original errors. If we have
made the estimates of both terms on precisely similar methods,
if we have asked the same questions of the same classes of
persons, included and omitted the same details on both occasions,
we shall have made the same errors of bias in both estimates.
To return to our previous illustration, if we have made the
glaring mistake of omitting everything except average weekly
wages in the income of an agricultural labourer on both occasions,
the only resulting error in the ratio will be that due to
the change in these extra payments, which in short periods is
likely to be small. Or, if we had taken summer wages as the


* If $E$ is the probable error in $e$ or $e_1$, then $E\sqrt{2}$ is the probable error in
their difference. See p. 305, infra.

average for the year in both cases, the error in the ratio will

depend only on the change in the relation of summer wages to 

that average. Hence the error in the ratio of two estimates 

at different dates of a slowly changing quantity is, if the 

estimates are made on similar methods, often much smaller 

than the error in either estimate singly ; for the unbiassed error 

is little greater, and the more important biassed error is much 

diminished. We need not now know of the existence of the 

biassed errors; they will disappear of themselves. If we are

aware that there are biassed errors, and have any means of 

making fairly good estimates of them, it will be worth doing ; 

but we shall make a great mistake if we correct the bias in 

one year and leave it uncorrected in another. For purposes of 

comparison it is very seldom of much use and often of great 

disutility to make the later estimate more accurate than the
earlier. [Sidenote: Need for uniformity in structure of serial returns.]
The error resulting from unbiassed errors can indeed be diminished
a little,* but the error resulting from the more important biassed
errors will only be increased. All Government officials and others who
compile annual returns are in a dilemma : to make their annual 
statements accurate in themselves, they should always be strain- 
ing after improvements, they should always be watching for 
changes in the quantities measured and adapting their methods 
and tabulations to these changes ; but to make their annual 
returns comparable with each other, they should be absolutely 
conservative, and cling to any mistakes they or their pre- 
decessors have made in the past with all the strength red tape 
can give them, being careful, however, not to add to the mistakes 
or make new omissions. The dilemma can in some cases be 
avoided ; for when an improved method is introduced, the 
tabulation can sometimes be given for a few years both on 
the old and on the new plans ; then when the difference 
introduced by the change is known, the earlier figures can be 
brought to the greater precision of the later. Thus the Board 
of Trade has recently included in the tabulation of exports 
ships which, leaving our shores with merchandise, are them- 



* For if $E$ and $E_1$ be typical of the unbiassed errors at the two dates,
then $\sqrt{E_1^2 + E^2}$ is typical of the error in the ratio, which diminishes with
either $E$ or $E_1$. See p. 305, infra.
selves sold to a foreign owner ; and we have the following
tabulation :*

                                             1899.           1898.
Exports of Home Products (exclusive
  of ships sold to foreigners)           £255,465,000    £233,359,000
Re-exports of Home and Colonial
  Merchandise                              65,020,000      60,655,000
                                         ------------    ------------
Total                                    £320,485,000    £294,014,000
Value of New Ships exported                 9,195,000     Not stated.
                                         ------------
New total                                £329,680,000

Ignorance of slight alterations in the collection and tabulation
of material has been the cause of many statistical mistakes.

[Sidenote: Results.]

To sum up the chief results of this chapter : there are two
processes which tend to accuracy : averaging, which diminishes
unbiassed errors ; and comparison, which diminishes
biassed error. The errors in weights are seldom
Errors in a result cannot, of course, be calculated, but can be 
expressed in terms of errors in the items, from which it comes ; 
we cannot attain certainty, but we can indicate processes which 
diminish errors, and with the help of mathematics measure the 
extent of diminution. Initial errors are diminished most, when 
we calculate the ratios of weighted averages of similar and 
similarly estimated quantities. Index-numbers, which we dis- 
cuss in the next chapter, are examples of this class. 
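
How comparison removes most of a common bias can be seen in a small numerical sketch. All the figures below are hypothetical (they are not the wage data of the text): the same faulty method, omitting the extra payments, is applied at both dates, and the error in the ratio is compared with the error in each estimate.

```python
# Hypothetical weekly incomes in shillings at two dates.
true_wage = {"earlier": 13.0, "later": 14.0}
true_extras = {"earlier": 2.0, "later": 2.1}  # payments the faulty method omits


def relative_error(estimate: float, truth: float) -> float:
    """Error of an estimate expressed as a fraction of the true value."""
    return (estimate - truth) / truth


true_income = {k: true_wage[k] + true_extras[k] for k in true_wage}
biassed_estimate = dict(true_wage)  # same omission on both occasions

err_earlier = relative_error(biassed_estimate["earlier"], true_income["earlier"])
err_later = relative_error(biassed_estimate["later"], true_income["later"])

true_ratio = true_income["later"] / true_income["earlier"]
estimated_ratio = biassed_estimate["later"] / biassed_estimate["earlier"]
err_ratio = relative_error(estimated_ratio, true_ratio)

print(f"error in earlier estimate: {err_earlier:+.1%}")
print(f"error in later estimate:   {err_later:+.1%}")
print(f"error in the ratio:        {err_ratio:+.2%}")
```

Each estimate is wrong by more than a tenth, while the ratio is wrong by well under one per cent., and the error in the ratio is nearly the difference between the errors in its terms.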

The accuracy resulting from the process of sampling requires 
more mathematical treatment, and is dealt with in Part II., 
Section V. 



* Quoted from the Economist, 17th February 1900, in the Statistical
Journal of March 1900.

CHAPTER IX.
INDEX-NUMBERS. 

The discussion of index-numbers supplies so good an illustra- 
tion of the principles laid down in the last chapter, and index- 
numbers are so important in themselves, that, though it is our 
intention to avoid special questions, it will be worth while to 
devote a short chapter to them. 

[Sidenote: Function of index-numbers.]

Index-numbers are used to measure the change in some
quantity which we cannot observe directly, which we know to
have a definite influence on many other quantities
which we can so observe, tending to increase all,
or diminish all, while this influence is concealed by the action
of many causes affecting the separate quantities in various ways.
Thus, to take three of the quantities to which index-numbers
are applied, the change in the relation of the precious metals 
to the work to be done by them affects prices of all com- 
modities, but very many other causes are at work affecting the 
prices of separate groups of commodities ; there are general 
causes tending to raise the wage of a week's work of average 
skill, but this general increase is concealed by numberless minor 
causes affecting different grades of labour in different degrees ; 
the change in the consumption of goods by the working or other
classes is a sufficiently definite quantity, but it can only be 
measured indirectly by observing the varying changes in the 
consumption of individual articles. 

The use of index-numbers is not, however, confined to these 
instances, but is nearly co-extensive with the field of statistics ; 
for we have limited the term statistics to the measurement of 
complex groups and their changes ; the object of statistics is to 
measure the action of the general laws which govern a hetero- 
geneous group, and the changes produced by general forces can 
be measured, as a rule, only by their effect in individual cases ; 
thus the method of index-numbers is at once applicable to the 
disentanglement of that which is common to the whole group
from those variations which are special to individual items. 

[Sidenote: Method of formation.]

The general method of forming an index-number, e.g., of the
fall of prices, is as follows : We select commodities, whose prices
we can estimate accurately, and tabulate their prices
for a series of years. Choosing the prices in one
year or the average for a sequence of years as a base, we express
each series of prices as percentages, year by year, of their height
in the chosen base-year, or of their average in the chosen period.
Then to find the index-number for any year in particular we 
take the average of the percentages in that year. 
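
The method just described may be sketched as follows. The commodities, years and prices are invented for illustration, and a single base year is assumed rather than a base period; the simple unweighted average of the percentages is used.

```python
from statistics import mean

# Hypothetical prices (in any one money unit) of three commodities.
prices = {
    "wheat": {1880: 40.0, 1881: 36.0, 1882: 30.0},
    "iron": {1880: 50.0, 1881: 50.0, 1882: 45.0},
    "wool": {1880: 10.0, 1881: 9.0, 1882: 9.5},
}


def index_numbers(price_table, base_year):
    """Express each price series as percentages of its base-year value,
    then average the percentages year by year."""
    years = sorted(next(iter(price_table.values())))
    result = {}
    for year in years:
        percentages = [
            100.0 * series[year] / series[base_year]
            for series in price_table.values()
        ]
        result[year] = mean(percentages)
    return result


index = index_numbers(prices, base_year=1880)
for year, value in index.items():
    print(year, round(value, 1))
```

Taking the average of a sequence of years as base would only change the divisor applied to each series.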

[Sidenote: Astronomical analogy.]

The problem, of which index-numbers should give the
numerical solution, may be compared to that presented to
astronomers who estimate the motions of the sun
by observing those of the stars. As the sun and
earth move towards some distant point, say in the constellation 
Hercules, the stars have an apparent motion, due to the unper- 
ceived motion of the observer; those in the region of space 
towards which he is travelling appear to be spreading out, as 
the distances separating them gradually subtend wider angles, 
while those in the region from which he is moving appear to 
close together, and those in directions perpendicular to the line 
of movement appear to move backward. Meanwhile all these 
stars have their proper motions, as rapid as that of the sun, 
but in as many different directions as there are stars. On 
the whole there is a trend in the directions determined by the 
sun's motion, but in individual cases this trend is entirely lost. 
So when a change in the currency has a general influence on 
prices, this influence is concealed by the movements due to 
causes affecting only some of the commodities. In both cases 
it is possible to find the general trend, if sufficient accurate 
observations are available. In both cases the problem is com- 
plicated by the possibility of links connecting the movements of 
groups of the stars or of the prices. 

[Sidenote: Index-numbers and samples.]

It has sometimes been supposed that we can estimate the
effects of general causes directly ; that we can, for instance, obtain
an objective measurement of the change in the purchasing
power of gold, by evaluating it at two dates
in terms of all commodities purchased, weighted by the amount
spent on each ; but it is better to neglect this method at once
both as impracticable and as not answering the purpose of
index-numbers, for the effects of minor causes affecting separate
commodities would not then be necessarily separated from the main
cause. 

Suppose that the changes in a group of quantities are deter- 
mined by one general force which acts on all in the same sense, 
that is, tends to increase all or decrease all, and by several other 
forces each of which acts on one or more of the quantities, and 
some of which tend to increase, others to decrease the quantities 
they affect ; then of the special forces, some will tend to increase, 
others to diminish the average, while the general force will 
have a cumulative effect entirely towards increasing, or entirely 
towards diminishing it. If the separate effects of the special 
forces are small compared with their number, they will tend to 
neutralise one another in their influence on the average; and 
the change in the average will show the influence of the general
cause only.* In the language of the last chapter, the special
forces produce unbiassed changes, which are negligible in their
effect on an average, in comparison with the biassed changes
produced by the general force. To obtain this elimination it is
necessary to take random samples, so that the laws of probability
may have free action ; and the two questions to be discussed are
the choice of samples, and the choice of weights to be applied to 
them. 
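
The neutralising of special forces in an average can be illustrated by a small simulation. It is only a sketch under assumed conditions: the special forces are taken to be independent and normally distributed, the general force lowers every quantity by 10 per cent., and all the numbers are invented.

```python
import random
from statistics import mean


def simulate_changes(n_items, general_change, special_spread, seed=0):
    """Percentage change of each quantity: one general force acting on all,
    plus an independent special force peculiar to each."""
    rng = random.Random(seed)
    return [general_change + rng.gauss(0.0, special_spread) for _ in range(n_items)]


changes = simulate_changes(n_items=400, general_change=-10.0, special_spread=15.0)

average_change = mean(changes)
# Share of quantities that rose in spite of the general fall.
share_moving_against_trend = sum(c > 0 for c in changes) / len(changes)

print(f"average change: {average_change:.1f} per cent.")
print(f"share of items that rose: {share_moving_against_trend:.0%}")
```

A fair share of the separate items move against the general trend, yet the average lies near the assumed general change; with few items, or with special forces that are linked together, the agreement would be much less close.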

[Sidenote: Unimportance of weights.]

As we have already seen, the effect produced by varying the
system of weights applied to so few as 30 or 40 numbers is
very slight, and the error resulting from errors in
weighting is in many cases much smaller than
the error resulting from faulty measurements of the quantities
weighted. We shall presently show† that the precision of an
average increases with the number of like quantities averaged. 
From these principles it is clear that it is more important to 
increase the number of our samples than to attempt accurate 
calculations of the proper weights to give them. 

The choice of samples is in practice very much limited, for 
in calculations extending over long periods we are dependent on 
the accidental preservation of records ; and when we have taken 



* This abbreviated statement should be criticised in the light of Part II.,
Sect. V., infra. See also Report of Committee on Variations in the Monetary
Standard, British Association, 1888-90.

† See p. 305, infra.

into our reckoning all the measurements which can be accurately
made, the number of samples barely comes up to the minimum 
necessary for the normal action of the laws of probability. 

[Sidenote: The Board of Trade Index.]

There are many index-numbers of wholesale prices extant,
some of which we may pass in review. The Board of Trade
publish the recorded quantity and value of goods
imported and exported, and the average prices of
these goods can be calculated. Those commodities are selected 
which occur in the returns for the whole period chosen. A 
particular year is chosen as base ; then the goods are valued 
in all other years separately at their prices in the base year; 
the total of these values in any year is the sum which the 
goods would have been worth if their prices had remained 
unchanged ; the ratio of this value to that actually recorded 
is the ratio of their average price in the base year to their 
average price in the other year selected (if the term average 
is used broadly), and if the first term of this ratio is equated 
to 100, the second term is the index-number required for the
year selected, expressed as a percentage of the number for the
base year. It is at once evident that we are here dealing with
weighted averages.
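
That the valuation just described is a weighted average of the separate price changes may be verified with a sketch. The goods, quantities and prices are hypothetical; the quantities are those of the selected year, and the weights are the values of those quantities at base-year prices.

```python
# Hypothetical quantities in the selected year and prices per unit.
quantities = {"cotton": 200.0, "wheat": 500.0, "tea": 50.0}
base_prices = {"cotton": 6.0, "wheat": 2.0, "tea": 10.0}
current_prices = {"cotton": 4.5, "wheat": 1.8, "tea": 8.0}


def valuation_index(qty, p_base, p_now):
    """Recorded value of the goods divided by their value at base-year prices."""
    value_now = sum(qty[g] * p_now[g] for g in qty)
    value_at_base = sum(qty[g] * p_base[g] for g in qty)
    return 100.0 * value_now / value_at_base


def weighted_mean_of_ratios(qty, p_base, p_now):
    """Weighted average of the price ratios, the weights being the values
    of the selected year's quantities at base-year prices."""
    weights = {g: qty[g] * p_base[g] for g in qty}
    ratios = {g: p_now[g] / p_base[g] for g in qty}
    return 100.0 * sum(weights[g] * ratios[g] for g in qty) / sum(weights.values())


by_valuation = valuation_index(quantities, base_prices, current_prices)
by_weighting = weighted_mean_of_ratios(quantities, base_prices, current_prices)
print(round(by_valuation, 2), round(by_weighting, 2))
```

The two routes give the same number, since each weight multiplied by its ratio is simply the recorded value of that commodity.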

[Sidenote: Systems of weights.]

Let $p_1, p_2, p_3 \ldots$ be the prices in the base year of units
of the goods selected, and $r_1p_1, r_2p_2, r_3p_3 \ldots$ the prices in
the year for which we require an index-number :
then $r_1, r_2, r_3 \ldots$ measure the changes of prices
for the separate commodities, and these $r$'s are the samples
from which we are to deduce the general change of price.
The weights used in the process described may be found
thus : let $b_1, b_2, b_3 \ldots$ be the numbers of units of goods in
the selected year ; then the total value in the selected year
at the prices of that year is $(b_1r_1p_1 + b_2r_2p_2 + \ldots)$, and at the
prices of the base year is $(b_1p_1 + b_2p_2 + \ldots)$ ; the ratio is
$\Sigma brp : \Sigma bp$, and the index-number for the selected year is

$$100 \times \frac{\Sigma (bp \cdot r)}{\Sigma bp}.$$

Here the weights applied to the $r$'s are the values which the
corresponding goods in the selected year would have borne at 
the prices of the base year. It is clear that the selection of the 
standard year affects the weights, for any particular commodity 
can be given special weight by choosing as base a year in which 
its price is high, and much trouble has been spent in searching 
for a "normal" year ; but though the weights of separate commodities
are affected, it does not follow that the average will
be altered, and we should expect from the principle laid down 
above that the change would be very slight. In fact we have 
the following figures : — 



INDEX NUMBERS OF 1886 AND 1883 COMPARED.*

                     Weights : values at the prices of
                     1873.     1883.     1861.     1881.
Imports.   1883       100       100       100       100
           1886       81.7      82.1      82.9      82.3
Exports.   1883       100       100       100       100
           1886       88½       88        87        89


It is possible to produce figures which show a variation 
caused by a change of base year, but it is done by choosing 
samples which lend themselves to the special argument. 

Since so great an alteration in choice of weights makes so 
little difference, it is worth while to see if we need even keep 
the weight due to the quantities imported (the $b$'s in the above
formulae). The following table may be quoted† to show that
these weights even have practically no influence :

Index-Numbers for 1895, when that of 1881 is 100, obtained by
Various Systems of Weighting.

          Weighted by    Weighted     Ratios of Prices ($r_1, r_2 \ldots$)   Reciprocal    Economist's
          Values of      by Declared  Arithmetic            Geometric        of A.M. of    Figures.
          1895 Quanti-   Values in    Mean.      Median.    Mean.            $1/r_1$,
          ties at 1881   1881.                                               $1/r_2 \ldots$
          Prices.

Imports      67½            69          73½        72½        72½               69
                                                                                              71
Exports      83             87          82         81         78½               75


* From the Economic Journal and the Statistical Journal, both June
1897.

† From the Economic Journal (with a correction in the statement of
weights).

In the first column of figures the goods of 1895 ($b_1, b_2, b_3 \ldots$
units at prices $p_1', p_2', p_3' \ldots$) are valued at the prices of
similar goods in 1881 ($p_1, p_2, p_3 \ldots$). The ratio of their
new to their old value, $\Sigma bp' / \Sigma bp = \Sigma (bp \cdot r) / \Sigma bp$,* is the ratio of the
index-number to 100. In the next column the index-number
is obtained by valuing the quantities of 1881 at the
prices of 1895 ; then the ratio of the new value to the old,
$\Sigma ap' / \Sigma ap$ (where $a_1, a_2 \ldots$ are quantities in 1881) $= \Sigma (ap \cdot r) / \Sigma ap$, is the
ratio of the index-number of 1895 to 100 for 1881. In the
next three columns the arithmetic mean, the median, and the
geometric mean of the $r$'s are given. In the last column but one the
arithmetic mean of $\frac{1}{r_1}, \frac{1}{r_2} \ldots$, that is of the ratios of the
prices of 1881 to 1895, is calculated, and the ratio of this
mean to 100 equals the ratio of 100 to a new index-number,
which corresponds to the former arithmetic mean with the
years 1881 and 1895 interchanged. The figure in the last
column is calculated from material given in the Economist ;†
every year the imports and exports are valued at their prices 
in the previous year, and thus an annual ratio is given similar 
to that in the first column of figures in the table just given ; 
the number 100, taken for 1881, is multiplied by this annual 
ratio year by year till 1895, and the number 71 is the result. 
[Algebraically this index-number is — 

A more complete analysis of these figures, and an investiga- 
tion as to the causes of the divergence between the export 
indices 87 and 75, would show which of the methods should be 
adopted. Here we will be content with noticing that the 
unweighted average, 82, is very near the first weighted 
average, 83. 
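
The four unweighted averages named in the table can be computed side by side. The price ratios below are hypothetical and are not the import or export ratios of 1881-95; `statistics.geometric_mean` requires Python 3.8 or later.

```python
from statistics import geometric_mean, mean, median

# Hypothetical ratios of later prices to earlier prices (the r's).
ratios = [0.60, 0.70, 0.75, 0.80, 0.95]


def summarise_ratios(rs):
    """Arithmetic mean, median, geometric mean, and the reciprocal of the
    arithmetic mean of the reciprocals, for one set of price ratios."""
    return {
        "arithmetic_mean": mean(rs),
        "median": median(rs),
        "geometric_mean": geometric_mean(rs),
        "reciprocal_of_mean_of_reciprocals": 1.0 / mean(1.0 / r for r in rs),
    }


summary = summarise_ratios(ratios)
for name, value in summary.items():
    print(f"{name:36s} {100 * value:6.1f}")
```

For positive ratios the reciprocal form can never exceed the geometric mean, nor the geometric mean the arithmetic, which agrees with the order of the three corresponding columns in the table.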

Further methods of dealing with such weights are given on 
p. 226, under Retail Index-Numbers. 



* Where $r_1 = p_1'/p_1$, $r_2 = p_2'/p_2$, &c.

† See, for instance, the quotation from the Economist in the Statistical
Journal, 1900, p. 129.




The advantage of index-numbers on the Board of Trade
basis is that they measure approximately an objective quantity,
and a result is obtained which can be stated in
terms which appeal to the ordinary man who is
not a statistician : such as, "The imports of 1895 would have
cost half as much again if their prices had been those of 1881 ; "
but, as pointed out above, it does not follow that this index
is the best measure of the less-definable quantity, "Fall in the
price of imports," where we imagine a general cause affecting
this class of commodities whose action is modified by other 
partial causes. 

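[Sidenote: Geometric mean.]

A special advantage of the geometric mean* is that the
results it gives are independent of the year chosen as base ; for
if $p_1, p_2 \ldots p_n$ and $p_1', p_2' \ldots p_n'$ are the prices
in two years, $\sqrt[n]{p_1 p_2 \ldots p_n} : \sqrt[n]{p_1' p_2' \ldots p_n'} :: 100 : I_2$,
the required index-number ; hence, if $p_1'', p_2'' \ldots p_n''$ are the
prices in a third year and $I_3$ its index-number,

$$I_2 = 100 \sqrt[n]{\frac{p_1'}{p_1} \cdot \frac{p_2'}{p_2} \cdots \frac{p_n'}{p_n}}, \qquad I_3 = 100 \sqrt[n]{\frac{p_1''}{p_1} \cdot \frac{p_2''}{p_2} \cdots \frac{p_n''}{p_n}},$$

$$100 \times \frac{I_3}{I_2} = 100 \sqrt[n]{\frac{p_1''}{p_1'} \cdot \frac{p_2''}{p_2'} \cdots \frac{p_n''}{p_n'}},$$
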

which would be the value obtained for the third year if only 
the second and third were considered. Considering the extra 
labour involved in calculating this mean, and the small advantage 
obtained by any alteration in the weighting, its use is not to 
be generally recommended.†
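
The independence of base claimed for the geometric mean, and its absence in the arithmetic mean, may be checked numerically. The prices for the three years are hypothetical, and the function names are illustrative only.

```python
from statistics import geometric_mean, mean

# Hypothetical prices of three commodities in three successive years.
year_1 = [10.0, 20.0, 40.0]
year_2 = [12.0, 18.0, 30.0]
year_3 = [9.0, 24.0, 20.0]


def gm_index(now, base):
    """Index-number formed as the geometric mean of the price ratios."""
    return 100.0 * geometric_mean([n / b for n, b in zip(now, base)])


def am_index(now, base):
    """Index-number formed as the arithmetic mean of the price ratios."""
    return 100.0 * mean([n / b for n, b in zip(now, base)])


# Third year compared with the second, by way of the first year as base ...
gm_via_first = 100.0 * gm_index(year_3, year_1) / gm_index(year_2, year_1)
am_via_first = 100.0 * am_index(year_3, year_1) / am_index(year_2, year_1)
# ... and compared with the second year directly.
gm_direct = gm_index(year_3, year_2)
am_direct = am_index(year_3, year_2)

print(round(gm_via_first, 3), round(gm_direct, 3))
print(round(am_via_first, 3), round(am_direct, 3))
```

The geometric results agree whichever year is the base; the arithmetic results differ a little, though in this example by less than one per cent.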

[Sidenote: Other index-numbers.]

Mr Sauerbeck and the Economist both avoid in part the
difficulty of weighting the separate ratios by their relative
importance in consumption, by selecting from those
commodities whose prices are most accurately
determined more instances of such widely consumed articles
as wheat than of less important commodities such as linseed.
Mr Sauerbeck has, in his annual articles in the Journal of the
Royal Statistical Society, verified the correspondence of the
unweighted average of his 56 ratios with the average of the same
weighted on various principles.

* Pointed out by Professor Edgeworth.

† On this point, and on others in this chapter, see article Index-Numbers,
in Palgrave's Dictionary of Political Economy.

[Sidenote: Importance of right choice of samples.]

While the choice of the special weights to be employed is,
when the number of ratios taken is at all considerable, quite
unimportant, the choice of the quantities dealt
with has great effect on the result. Thus import
figures, relating to raw materials and the produce
of other countries, do not lead to the same index-numbers as
export figures dealing with the price of our own produce,
though the tables just given show that they are little affected by
weights ; and neither of these agree closely with Mr Sauerbeck's
or the Economist's numbers, and these again are not in complete
agreement. The samples on which these four sets of numbers
are based are from different groups of commodities, and the 
numbers show that the same forces do not affect these groups 
in the same degree. When we have so multiplied our samples, 
that we can subdivide them without affecting the index-numbers 
deduced, we may expect our results to represent the required 
measurement* 

[Sidenote: Great advantage of the median.]

If we compare the Economist index-numbers with Sauerbeck's
during the period 1860-70, we see that the former show
a very much greater increase during the cotton
famine than the latter. An index-number which
can be greatly disturbed by fluctuations, however violent, in
only one group of commodities, is clearly wanting in some of
the chief qualities of a general measure of price levels. A very
simple means of avoiding this difficulty, and indeed all the
intricacies of weighting, is to take the median of all the price
ratios of a particular year as the index-number of that year. 
It is perhaps impossible to show theoretically that any other 
average satisfies the required conditions better than the median, 
and there can be no doubt that it is practically the easiest to 
calculate. 
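
The steadiness of the median under a violent movement in one commodity can be shown with a few invented price percentages; the figure of 300 stands for a single price trebling, as cotton might in a famine, and is not taken from either index.

```python
from statistics import mean, median

# Hypothetical price percentages for seven commodities in an ordinary year.
ordinary = [95, 98, 100, 102, 105, 97, 101]
# The same year, except that the last commodity has trebled in price.
disturbed = ordinary[:-1] + [300]

mean_shift = mean(disturbed) - mean(ordinary)
median_shift = median(disturbed) - median(ordinary)

print("arithmetic mean moves by", round(mean_shift, 1))
print("median moves by", median_shift)
```

The arithmetic mean is carried up by more than a quarter, while the median is not moved at all.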

* Mr Sauerbeck's numbers are to be found in annual articles by him in
the Statistical Journal ; and a diagram showing them from 1820 is
published by Effingham Wilson (1s.).

[Sidenote: Proposed standard.]

If, on the other hand, paucity of data makes the inclusion of
weights necessary, and the popular desire for concrete measurements
makes a fine show of weighting expedient,
we perhaps cannot do better than to adopt the
standard proposed by the Committee of the British Association,
already mentioned, for the construction of an index-number,
which might be the basis of business transactions involving
future payments. This standard is as follows :

Basis of Index-Number recommended by the Committee appointed by
the Economic Section of the British Association, 1888.





                       Estimated
                       Expenditure   Hence
                       per Annum     Weights      Group
Articles.              on each       assigned.    total.   Prices to be taken from
                       (£000,000's
                       omitted).

Wheat                      60           5                  Gazette average, English wheat.
Barley                     30           5                  Gazette average, barley.
Oats                       50           5                  Gazette average, oats.
Potatoes, rice, &c.        50           5           20     Av. import price, potatoes.

Meat                      100          10                  Market quotations, live meat, Smithfield.
Fish                       20           2½                 Board of Trade Returns ; average per cwt. landed.
Cheese, butter, milk       60           7½          20     Cheese and butter, average import price.

Sugar                      30           2½                 Av. import price, refined sugar.
Tea                        20           2½                 Av. import price, tea.
Beer                      100           9                  Av. export price, beer.
Spirits                    40           2½                 Av. import price, spirits.
Wine                       10           1                  Av. import price, wine.
Tobacco                    10           2½          20     Av. import price, tobacco.

Cotton                     20           2½                 Av. import price, cotton.
Wool                       30           2½                 Av. import price, wool.
Silk                       20           2½                 Av. import price, raw silk.
Leather                    10           2½          10     Av. import price, hides.

Coal                      100          10                  Av. export price, coal.
Iron                       50           5                  Market price, Scotch pig-iron.
Copper                     25           2½                 Av. import price, copper ore.
Lead, zinc, tin            25           2½          20     Av. import price, lead ore.

Timber                     30           3                  Average import price.
Petroleum                   5           1                  Average import price.
Indigo                      5           1                  Average import price.
Flax and linseed           10           3                  Average import price.
Palm oil                    5           1                  Average import price.
Caoutchouc                  5           1           10     Average import price.



Since we can only obtain rough correspondence in dealing
with wholesale prices, we cannot expect to be able to measure
retail prices with any great precision. For we saw
in the preceding chapter that the error in an average
bears a definite relation to the errors in the items which
compose it ; if the errors in the items are on the whole doubled,
it is likely that the errors in the average and in the ratio of two
averages will also be doubled, and we shall need four times* as
many samples to restore the precision. Unfortunately the



* See p. 305, infra.

material for computing a retail index-number is even more incomplete
than that for wholesale prices, and owing to the smaller
number of articles that can be included, and the preponderance 
of such items as bread and rent, the question of weighting 
becomes of more importance. 

[Sidenote: Special difficulties.]

When we wish to construct an index-number to show the
purchasing power of money to special classes, we must take
into account some considerations which can be
ignored when dealing with wholesale price numbers.
Different classes of persons at the same time, and the
same classes at different times, spend their income in varying 
proportions on different objects. If we could collect enough 
sufficiently accurate samples, this fact would not matter so much ; 
but it would still be of some importance owing to the tendency 
to make increased purchases of cheapening commodities. As it 
is, it would be necessary to construct separate index-numbers for 
each class and each district. The difficulty of insufficient and 
inaccurate data cannot at present be overcome ; but as it is pos- 
sible that we may in the future get definite records of retail 
prices sufficiently numerous to make up for their want of pre- 
cision, we may glance at the other details of the problem. To 
form an index-number for a particular class of people, we need 
records of the method of expenditure of their income at all 
the dates in question, of sufficient numbers to obtain the slight 
precision which weighting needs. [Sidenote: Methods of weighting.] Then if we had fairly good
records of retail prices several methods of weighting
are open to us,* all of which are likely to give
nearly the same result. The necessity of weighting and the 
methods are best shown by a numerical illustration. 
Suppose the following records of expenditure: — 



First Year.                    s.  d.      Second Year.                  s.  d.
6 quarterns bread at 6d.       3   0       7 quarterns at 5d.            2  11
4 lbs. meat at 7d.             2   4       5 lbs. at 8d.                 3   4
½ lb. tea at 3s.               1   6       1½ lbs. at 1s. 4d.            2   0
                               ------                                    ------
                               6  10                                     8   3

The second year's budget at the first year's prices would cost
10s. 11d. ; index-number of retail prices on this basis :

$$100 \times \frac{8\text{s. }3\text{d.}}{10\text{s. }11\text{d.}} = 75.6 \quad (a).$$

* See article on Wages, Nominal and Real, in Palgrave's Dictionary of
Political Economy, pp. 640-641.

The first year's budget at the second year's prices would cost
5s. 10d. ; index-number on this basis $100 \times \frac{5\text{s. }10\text{d.}}{6\text{s. }10\text{d.}} = 85.4$ (b).

The ratio of the costs of 6½ quarterns, 4½ lbs. meat, 1 lb. tea,
the averages of the quantities in the two years, at the two sets
of prices, is 8s. 10½d. to 7s. 0½d. = 100 : 79.3 (c).

If we disregard the records of changing expenditure, we 
find that the unweighted average of the three ratios of prices is 
100 : 80.8 (d). 

If we suppose the second sum (8s. 3d.) to be spent on bread, 
meat, and tea in the same proportional parts as the first sum 
(6s. lod.), we have — 

$\frac{36}{82}$ of 8s. 3d., i.e., 43½d., would have bought $\frac{43\frac{1}{2}}{5}$ quarterns at
the second price, which would have cost $\frac{43\frac{1}{2}}{5} \times 6$d. at first price.

Working out the other items in a similar way, we find that 
the second sum distributed in the same proportions as the first 
would have bought goods which would have cost 10s. 10½d. at
the first prices ; and the resulting index-number is

$$\frac{6\text{s. }10\text{d.}}{10\text{s. }10\frac{1}{2}\text{d.}} \text{ of } 100 = 62.8 \quad (e).$$

This reduction is due to the large expenditure on bread on 
this hypothesis, which can easily be shown to be an unreasonable 
one if we suppose the price of bread to be reduced to nothing, 
while the other prices rise ; then on this hypothesis the fall is 
infinite. 

Of the above numbers, (c), (d), and (e) do not seem to rest on
sound hypotheses ; (a) clearly overstates and (b) clearly understates
the fall ; and therefore some number between (a) and (b)
is the number required. If (a) and (b) lie close together there
is no further difficulty ; if they differ by much they may be
regarded as inferior and superior limits of the index-number,
which may be estimated as their arithmetic mean (80.5) as a first 
approximation. 
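
The numerical illustration can be reworked in a few lines, with all prices reduced to pence. The quantities and prices are those of the two budgets above; the variable names are illustrative. On this reckoning (c) comes to about 79.3 and (d) to about 80.7.

```python
# Prices in pence per unit (quartern, lb., lb.) and quantities bought.
prices_1 = {"bread": 6.0, "meat": 7.0, "tea": 36.0}
prices_2 = {"bread": 5.0, "meat": 8.0, "tea": 16.0}
budget_1 = {"bread": 6.0, "meat": 4.0, "tea": 0.5}
budget_2 = {"bread": 7.0, "meat": 5.0, "tea": 1.5}


def cost(budget, prices):
    """Cost in pence of a budget at a given set of prices."""
    return sum(budget[g] * prices[g] for g in budget)


# (a) second year's budget valued at both sets of prices.
index_a = 100 * cost(budget_2, prices_2) / cost(budget_2, prices_1)
# (b) first year's budget valued at both sets of prices.
index_b = 100 * cost(budget_1, prices_2) / cost(budget_1, prices_1)
# (c) the average of the two budgets valued at both sets of prices.
mean_budget = {g: (budget_1[g] + budget_2[g]) / 2 for g in budget_1}
index_c = 100 * cost(mean_budget, prices_2) / cost(mean_budget, prices_1)
# (d) unweighted average of the three ratios of prices.
index_d = 100 * sum(prices_2[g] / prices_1[g] for g in prices_1) / len(prices_1)
# First approximation: arithmetic mean of the limits (a) and (b).
first_approximation = (index_a + index_b) / 2

for name, value in [("a", index_a), ("b", index_b), ("c", index_c),
                    ("d", index_d), ("mean of a and b", first_approximation)]:
    print(f"({name}) {value:.1f}")
```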

[Sidenote: Further difficulties.]

While it is useful to have a definite means of calculating
these numbers to bring extravagant statements to a numerical
test, there are two further considerations which
hinder the complete solution of the problem. In
all budgets rent is an important item, and there seems no
prospect of obtaining any good estimate of the relation between
increasing rent and improving accommodation, allowing for the
benefits of public expenditure paid by rates included in rent.
Again, if we consider, not how money is spent, but how it might 
be spent, we should have to introduce a more general factor ; 
for the margin which remains when necessities are satisfied has 
a rapidly growing purchasing power, as the products of machinery 
increase in variety and diminish in price ; perhaps the calculated 
fall in wholesale prices forms a fair measure of this growth. 

[Sidenote: Index-numbers of consumption.]

Leaving this somewhat unfruitful topic, let us return for a
moment to the measurement of a quantity more typical of
index-numbers.* If we have to measure the action of a cause,
which affects quantities which have no common
measure, we are still able to apply index-numbers.
A general increase has taken place in the consumption of 
imported goods, and if we can measure this increase indepen- 
dently of any change in price, we can use it as an argument to 
support the alleged increase in real wages. The only common 
measure of bread, currants, cheese, meat, &c., of practical value is 
their price, their weight being useless for the purpose. If the 
quantities consumed year by year of a number of such com- 
modities are written down, expressed as percentages of the con- 
sumption in any years (not necessarily the same), we have series 
of numbers which only need weighting to form the index- 
number required. We can in this case verify, that any logical 
choice of weights, based on their value or their assumed im- 
portance, or even a random system of weights, gives much the 
same index-number as the simple arithmetic averages; in fact, we 
have a sufficiently good group of samples to render us nearly 
independent of weights. When this is the case we can say with 
safety that the number required lies in the neighbourhood of the 
group given by the various systems of weights, and choose what 
appears the most logical system for the estimate we adopt. In 
the paper referred to, five different systems applied to only 
fourteen commodities give results for the increase of consump- 
tion all between 13.8 and 20.1 per cent, in the period 1873-96. 

* The following illustration is based on Mr G. H. Wood's paper on
Some Statistics of Working Class Problems, cited above.

[Sidenote: Wage index-numbers.]

The application of index-numbers to wage statistics does not
involve any fresh principles. It is not permissible to ignore
weights in this case ; for an unweighted average
would not allow for the general tendency to increase
numbers where wages are rising. There is great liability
to "biassed" errors in separate averages ; for wages for overtime,
specially high piece-wages, wages of large uncombined
classes of low-skilled or badly paid workpeople, may often be 
omitted in wage records. These biassed errors, however, tend to 
disappear in comparison ; and it may prove possible to construct 
a wage index-number of very fair precision. 
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CHAPTER X. 
INTERPOLATION. 

I. General. 

It is very often the case in practical statistics that we are not
able to make serial estimates as frequent, or descriptions of
groups as detailed, as is necessary for their use in
further investigations. Thus the population is
only counted once in ten years ; but we need to bring monthly 
and annual accounts — births, deaths, trade returns, &c. — into 
close relation to the existing number of people, and estimates 
for the budget and the yield of taxes must be based on the 
assumed number of taxpayers for the current year ; it is 
therefore necessary to interpolate estimates for the number of 
the people in intercensal years. Again, interpolation is needed 
for the statement of the distribution of the population according 
to age, a tabulation which is necessary for actuarial work and 
for sociological purposes. The ages returned on the house- 
holder's schedule are nominally correct to the year, but in 
practice they are known to be inaccurate, tending to group
themselves in the neighbourhood of round numbers; but the
returns for such age periods as 35-45 years are more correct, since
the persons who return themselves as 40 years old are probably 
within 5 years of that age. The original returns are so erroneous 
that they are not published at all, but the numbers are only 
given in the ten-yearly periods ; from the numbers so given, it 
is necessary to estimate the numbers for the individual years. 
Again, the compilers of the wage census of 1886-91 enumerate 
the numbers earning wages "of 15s. and under 20s.," "of 
20s. and under 25s.," and so on, but not the numbers in 
shilling limits. In problems relating to wages we often need 
more detail ; and when we are comparing these wages with a 
similar group in France, we must devise a scheme by which 
grades of 2 francs can be compared with grades of 5s., by a 
suitable system of interpolation. Such a necessity is very
common when we wish to compare groups which are similar
but tabulated on diverse systems. Thus, two countries conduct
their census at different dates. In one country the age groups
are of fifteen years, in another of ten; in one, "young persons"
are those under 21; in another, those under 18. Occasional
estimates seldom correspond in date; wage statistics are found
for 1840, 1850, and 1892 in France, and for 1866, 1885, 1886,
and 1891 in England. Similar differences are found when we
are comparing county with county ; and a discussion of the 
method of determining averages in such a case will illustrate 
some of the elementary problems of interpolation. 

Suppose that the figures printed in Roman type in the
following table are accurate returns of the weekly
wages in three districts, and that we wish to find
the average change in the three together.



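Years.        1860.   1862.   1864.   1866.   1870.   1871.   1875.   1878.   1880.   1881.
              s. d.   s. d.   s. d.   s. d.   s. d.   s. d.   s. d.   s. d.   s. d.   s. d.
District A    12 6    15 0    15 0†   15 0    15 0†   14 6    18 0    18 0    17 6    17 0
District B    18 0    19 0†   19 0    20 0†   20 0    19 6†   21 0†   21 0    20 6†   20 0
District C    10 0    11 0†   11 0    12 0    12 0†   12 0†   15 0    15 0    15 0    14 6

Average       13 6    15 0    15 0    15 8    15 8    15 4    18 0    18 0    17 8    17 2

† Figures which the text identifies as interpolated (printed in italics in the original).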


It is clear that there is something to be learnt about the
general course of wages from the data, but the lessons are not
obvious. The following figures, printed in the table in italics,
are those which naturally suggest themselves. There is no sign
in A of any change between 1862 and 1866, so we write 15s. for
1864. Judging from B, the figure for 1870 is not likely to have
been lower than that for 1864, so we write 15s. for A in 1870.
A is now complete; we notice that in A the first rise was complete
by 1862, and assuming the same in B, we obtain 19s. for
1862. In C there is a rise between 1864 and 1866, while in A
there is no change from 1866 to 1870; B will correspond if we
write 20s. in 1866. If we write for B, 19s. 6d. in 1871, 21s. in
1875, and 20s. 6d. in 1880, we shall have close correspondence
with A from 1866 to 1881. Similar reasons lead to the numbers
interpolated for C. The unweighted average can then be calculated
year by year, which could not be done directly from the
data. This average reflects all the changes in the original figures
and gives no special predominance to any. It may be regarded
as the most probable series that can be based on the given
information.
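
The year-by-year unweighted average can be reproduced mechanically once the table is complete. A short Python sketch follows, using the figures of the completed table (given and interpolated together); the function and variable names are illustrative only.

```python
# Weekly wages from the completed table as (shillings, pence), one entry per year.
YEARS = [1860, 1862, 1864, 1866, 1870, 1871, 1875, 1878, 1880, 1881]
WAGES = {
    "A": [(12, 6), (15, 0), (15, 0), (15, 0), (15, 0),
          (14, 6), (18, 0), (18, 0), (17, 6), (17, 0)],
    "B": [(18, 0), (19, 0), (19, 0), (20, 0), (20, 0),
          (19, 6), (21, 0), (21, 0), (20, 6), (20, 0)],
    "C": [(10, 0), (11, 0), (11, 0), (12, 0), (12, 0),
          (12, 0), (15, 0), (15, 0), (15, 0), (14, 6)],
}

def to_pence(shillings, pence):
    return 12 * shillings + pence

def unweighted_average(year_index):
    """Simple mean of the three districts, returned as (shillings, pence)."""
    total = sum(to_pence(*WAGES[d][year_index]) for d in WAGES)
    return divmod(round(total / len(WAGES)), 12)

averages = {year: unweighted_average(i) for i, year in enumerate(YEARS)}
for year, (s, d) in averages.items():
    print(year, f"{s}s. {d}d.")
```

The printed series agrees with the "Average" row of the table.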

We will now notice the assumptions tacitly made in proceeding
by this method. First, it has been assumed that there
are no sudden jumps, that such a figure as 20s.
for A 1864 is inadmissible; this is only justifiable
if we are acquainted with the general causes which influence
the rate of wages, and know that there was no violent disturbance
in the intermediate dates. We could not make this
assumption as to wages in the cotton trade in the time of the
American Civil War, nor can we make it over a long series
of years. Secondly, it has been assumed that in the absence
of evidence to the contrary the rise or fall has been uniform.
Thus, in B 1878-81, the wage in 1880 is assumed to be intermediate
between 1878 and 1881; if there had been no indication
from A that it was half-way between in point of wages,
it might have been said that in point of time it was two-thirds
of the way, and 20s. 8d. should be interpolated for 1879 and
20s. 4d. for 1880, if it was worth while to depart from round
numbers. Thirdly, it has been assumed that the course of
wages in the three districts was similar. Thus in A there is
a rise from 1860-62, but there is no further improvement at
any rate before 1866; it is consequently assumed that the rise
registered in B and C before 1864 actually took place before
1862. Again, when considering the period 1870-75, we notice
that in A there is a fall till 1871, and a sharp rise to 1875, and
no change to 1878; in B, therefore, it is assumed that the wage
of 1875 is equal to that of 1878, and the fall in 1871 may be
allowed because it increases the sharpness of the rise in 1871-75.
In C it is doubtful whether the 12s. in 1871 should not rather
be 11s. 6d. The reasons against are that a gain on a low wage
is often not so easily lost as a gain on a high one; 6d. is a larger
drop proportionately on 12s. than on 15s.; that the rise of 3s. 6d.
which would then be shown 1871-75 is a larger proportionate
rise than in either A or B; and that the existence of the fall in
1870-71 depends only on the evidence of a fall between 1866-71.
When the figures are few in number, it is necessary to examine
them in this way to pick out the most probable; and it is often
fairly easy to fill in the figures which satisfy all the existing
evidence fairly closely.

The question at once arises, What certainty have we that 
these quantities, by hypothesis unknown, are in reality anywhere 
near the figures which on the face are most probable ? 

In some cases of interpolation, dealt with presently, the
answer can be given as a statement of mathematical probability,
such as: it is 2 to 1 against a divergence
of 6d. from the assigned figure, 30 to 1 against
one of 1s., 1,000 to 1 against one of 2s. 6d., and so on; but
in the figures most often cropping up in investigations it is
not possible to assign such a precise probability. There is
one rough but useful way of testing the accuracy of such
interpolation as in the case before us which can be explained
by an example. Test how far we can throw out our calculated
average for 1870, without violently infringing the common-sense
of the question. Make A and C as large as possible in these
dates; we may perhaps suppose a rise of 1s. above 1866, seeing
that there is one in B between 1864 and 1870. We can hardly
suppose either that 1870 is as high as 1875-78, or that there is
a great drop of as much as 2s. in the single year, if we are
acquainted with the causes that determine the wages at those
dates. Let the highest wage we can assign to A and C be
16s. 6d. and 13s. 6d. respectively. Our average is then 16s. 8d.
instead of 15s. 8d. Similarly, we might perhaps think that
14s. and 11s. were the lowest possible in A and C in 1870;
then the average would be 15s. Assuming that we know enough
about the general trend of events at these dates to assign limits
in this way, we can say it appears improbable that the average
wage in 1870 was less than 15s. or more than 16s. 8d., and that
the evidence points to 15s. 8d.
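
The limits just stated follow from the same averaging rule applied to the extreme admissible figures. A brief Python sketch of the check, with illustrative names:

```python
def average_sd(*wages):
    """Mean of wages given as (shillings, pence), returned in the same form."""
    pence = [12 * s + d for s, d in wages]
    return divmod(round(sum(pence) / len(pence)), 12)

B_1870 = (20, 0)  # the one figure actually returned for 1870

estimate = average_sd((15, 0), B_1870, (12, 0))  # interpolated A and C
upper = average_sd((16, 6), B_1870, (13, 6))     # highest A and C admitted
lower = average_sd((14, 0), B_1870, (11, 0))     # lowest A and C admitted

print("estimate", estimate, "limits", lower, upper)
```

The interpolated average of 15s. 8d. thus lies between 15s. and 16s. 8d.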

The accuracy of our interpolation then depends (1) on
knowledge of the possible fluctuations of the figures, to be 
obtained by a general inspection of the fluctuations at dates 
for which they are given ; (2) on knowledge of the course of 
the events with which the figures are connected. 

A second example of a similar kind* may be
given to illustrate the numerical calculation.

* Taken from Agricultural Wages in England, in the Statistical Journal,
December 1898, by the present author.

Northern Counties. Weekly Agricultural Wages in

                               1867-69.    1869-70.
                                s. d.       s. d.
Cheshire                       13  1†      13  6
Lancashire                     15  0       15  0
West Riding of Yorkshire       14  6       16  5
East Riding of Yorkshire       14  6†      14 11
North Riding of Yorkshire      14  6       15  4
Durham                         16  6       16  0
Northumberland                 16  6       16  7
Cumberland                     14  4†      14  9
Westmoreland                   15  7†      16  1

† Interpolated figures (italic in the original); all other figures are given.



The averages of the wages in the five districts for which
data exist in both periods are 15s. 4.8d. in 1867-69 and 15s. 10.4d.
in 1869-70, that is in the ratio 33 : 34. If we assume that the
wages in the other counties have been influenced by similar
causes and increased in the same ratio, we obtain the figures
interpolated in the table. The unweighted averages for the
northern counties are now 14s. 11d. and 15s. 5d. in the two
periods, instead of 15s. 3d. and 15s. 5d., the averages of the
given numbers. For general comparison all over England 
between these two years we should have been obliged to neglect 
the missing counties in both years, which would have unfairly 
lowered the general average, since these counties have in recent 
times had wages above the English average though below that 
of the northern district. At the same time we should have 
unfairly raised the apparent average of the northern district. 
We should also have lost the probable figures for the special 
counties at the earlier date which are on a fairly safe basis ; 
for the wages in these counties of the Northern District remain 
in nearly the same order through the last fifty years. At the 
same time it is easily seen that these wages are not so accurately 
known as those not interpolated, and it is well to notice in 
arguments based on such figures, to what extent the interpolated 
figures are involved. 
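
The ratio method used for the northern counties can be set out as a short calculation. The Python sketch below takes the wages from the table above; the data layout and names are illustrative, and figures are rounded to the nearest penny.

```python
from fractions import Fraction

# Weekly wages as (shillings, pence) for 1867-69 and 1869-70.
# None marks a county with no return for the earlier period.
WAGES = {
    "Cheshire": (None, (13, 6)),
    "Lancashire": ((15, 0), (15, 0)),
    "West Riding": ((14, 6), (16, 5)),
    "East Riding": (None, (14, 11)),
    "North Riding": ((14, 6), (15, 4)),
    "Durham": ((16, 6), (16, 0)),
    "Northumberland": ((16, 6), (16, 7)),
    "Cumberland": (None, (14, 9)),
    "Westmoreland": (None, (16, 1)),
}

def pence(wage):
    shillings, d = wage
    return 12 * shillings + d

# Ratio of earlier to later totals over the counties returned in both periods.
both = [(pence(a), pence(b)) for a, b in WAGES.values() if a is not None]
ratio = Fraction(sum(a for a, _ in both), sum(b for _, b in both))

# Scale each later wage by that ratio to fill the missing earlier figure.
interpolated = {
    county: divmod(round(pence(later) * ratio), 12)
    for county, (earlier, later) in WAGES.items()
    if earlier is None
}
print(ratio, interpolated)
```

The ratio reduces to 33/34, and the four scaled figures are those marked as interpolated in the table.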

A process very similar to that just employed is used in 
giving marks at school to students who are absent from a lesson ; 
attention is paid both to the particular student's general place 
in the class order, and to the average value of the marks obtained 
by the rest of the class in the lesson missed. 

Though the method be fairly complete it is very important
to notice that interpolated figures rest on quite a different class
of evidence to those which are the result of direct
evidence. In some cases they may represent
quantities which have no existence (as in the case
of school marks) and which are only used for convenience
of calculation. In others they are simply figures
adopted as those which in default of definite knowledge appear
most probable. They must always be clearly indicated as interpolations;
it is always well to state the method by which they
are obtained, and any subsidiary information which may be regarded
as direct evidence of their accuracy, and if practicable
they may be given not as exact, but as lying between certain
limits; thus the interpolated figures for Cheshire might be
written 12s. 6d. to 13s. 6d., instead of 13s. 1d.

Several different cases are met with in interpolation, some of
which are treated algebraically in the next section, while others 
can be illustrated at once by numerical examples. 

The Graphic Method. If we know the values of quantities
at isolated positions, such as the numbers of the population
at the ages 25 to 35, 35 to 45, &c.; the population in
1871, 1881, 1891, &c.; wages in 1860, 1870, 1873,
&c.; the numbers whose wages are from 15s. to 20s., 20s. to 25s.,
&c., we may represent the facts by such a diagram as:

[Diagram: values plotted at the years 1860, 1866, 1870, 1877, 1880 and 1884.]

Suppose that we need the value of the quantity in 1875. If
we were only given the two points C and D, the simplest
hypothesis, and the one to be made in the absence of any
evidence to the contrary, is that the quantity increased uniformly
between C and D; representing such an increase by the straight
line C D, the height of the point x will represent the quantity
in 1875.

If the point E is also given, the hypothesis represented by
the straight lines C D, D E will not stand, for it assumes a sudden
break in the regularity at the point D in 1877, for which there
is no evidence. We must take into account all the points given,
and through them all a line must be drawn whose curvature is
as smooth as possible, for in the absence of evidence to the
contrary, sudden changes in the quantities may be assumed not
to exist. Such a curve can be constructed on mathematical
principles, or may be drawn freehand; if the latter, it will often
be quite as near the facts as the argument will allow us to go.
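
The first hypothesis, a uniform increase between two given points, is ordinary linear interpolation. A minimal Python sketch follows. It assumes, from the scale of years on the diagram, that C stands at 1870 and D at 1877; the two heights are invented, since the diagram's values are not stated.

```python
def interpolate_linear(x0, y0, x1, y1, x):
    """Height at x of the straight line joining (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Assumed points: C = (1870, 20.0) and D = (1877, 27.0); heights are hypothetical.
value_1875 = interpolate_linear(1870, 20.0, 1877, 27.0, 1875)
print(value_1875)
```

When a third point such as E is admitted, this rule is no longer sufficient, and a curve through all the points is needed, as the text explains.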

This method only applies to continuous quantities, such as 
numbers at different ages, population at different dates, earners 
at different wages in a very large group of wages. Thus for all 
England the average wage must change gradually, but the wage 
of the London builders changed suddenly as the result of 
strikes and arrangements at certain dates. In this case we 
must draw the figure to correspond as closely as possible to 
the evidence, such as:

[Diagram: a broken line through the points A, B, C, D, E, drawn against a line of years.]

where A B represents a sudden rise; B C a gradually accelerated
increase due to improving trade, C D a slow falling off from
the wage reached at C, and D E a determined and successful
effort to recover the lost ground.

Periodic Figures. — If we know the annual averages of 
figures which have a yearly period and a sufficient number of 
monthly averages to estimate the periodic fluctuations by the 
method described on pp. 182-7, we can interpolate figures for any 
month for which the returns are incomplete with fair accuracy. 
Thus if we are dealing with the numbers of unemployed as given 
in the Labour Gazette, we find a periodicity which is not very
strongly marked in all the months, but there is in general a fall
in the spring and a rise in the late autumn, and June is generally
the minimum month. We can then make use of the small 
diagrams on p. 184, and, having marked in all the information 
we have, draw the waves on the rising, stationary, or descending 
line of averages, so that the fluctuating lines shall pass through 
all the given points. We can obtain an idea of the accuracy 
of the resulting figures by noticing the general characteristics 
of the given figures ; we find that the percentage unemployed 
has never changed more than two units in one month, that 
there are no fluctuations which have lasted less than three or 
four months, and that the percentages have never been below 
1 or above 10. Finally, we can look at the trade history of
particular dates, and in the light we thus obtain reject any 
improbable figures. 

Use of Subsidiary Curves. — If we are able, by the 
methods described in Chapter VII., Sect. III., or Sect. V.,
to find a close connection between two series, we can use the
more complete of them to assist the interpolation of any missing
figures in the other. We must first investigate carefully the close- 
ness and nature of correspondence at the dates for which we 
have complete figures in both series. Then we can draw dia- 
grams, similar to those facing p. 175, one of the lines being 
incomplete. Then completing the broken line, so as to bring 
it into as close resemblance with the completed line as the given 
points allow, we shall obtain the most probable values for the 
missing figures. The accuracy of the result can be tested as 
in the previous case. This method may reasonably be used 
in interpolating figures for the yield from one source of revenue 
by means of the yield from another ; for the value of exports 
from that of imports ; for the marriage rate from foreign trade ; 
for the wages in one district from those in another; for the 
number of unemployed from the changes in consumption of 
foods ; for changes in parts of the population, when we know the 
changes in the whole, and for many other series. 

General. Series of figures may be classed in three groups:
(1) periodic; (2) symptomatic, where there is a general tendency
towards increase (as in serial wage statistics) or
decrease (as in the English birth rate in recent
decades); (3) those which have no period and no symptom, but
only apparently random fluctuations.

To interpolate figures in series of the third group it is necessary
to obtain a measure of the fluctuations, by the theorems
of Part II., and then we can assign the mathematical probability
of the various numbers possible. In series of the second
group, we must pay attention in addition to the symptomatic
tendency. The necessity for interpolation of this kind does
not, however, arise frequently, so we will not offer detailed
illustrations of it.

Section 2. Algebraic Treatment.

The problem of interpolation to which most attention has been
given may be stated as follows: When one quantity is subject
to continuous regular change, and a second quantity changes in
connection with it, and we know or can estimate directly only
some discontinuous values of this second quantity, it is required
to find the probable values of the second quantity which correspond
to given values of the first: for instance, given the expectation
of life at the ages 15, 20, 25, &c., it is required to find it for
intermediate ages; given the population of the country in 1871,
1881, 1891, 1901, find it at intermediate dates. The only permissible
assumptions are that the quantity changes continuously,
that is with no breaks at any figure, and that the rate of change 
of the quantity is also continuous, that is that the line represent- 
ing its value is not angular, but smooth. This problem differs 
from those just discussed, in that there is likely to be a law 
binding such figures together, whereas in the former cases the 
consecution was apparently random. 

It is necessary to divide the problem into two classes : — 

A. Where the given values may be assumed to be accurate ; 

B. Where the given values are liable to correction. 

A. Some preliminary algebra is necessary; it is derived 
principally from Boole's Finite Differences and De Morgan's 
Differential Calculus, to which authorities readers may be referred
for more detailed treatment. 

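1. Let $y$ be a continuous function of $x$, and let $y_0, y_1, y_2, \ldots$
be any values of $y$.

Let $\Delta_0^1, \Delta_1^1, \Delta_2^1, \ldots$ be written for $y_1 - y_0,\ y_2 - y_1,\ y_3 - y_2, \ldots$

$\Delta_0^2, \Delta_1^2, \Delta_2^2, \ldots$ for $\Delta_1^1 - \Delta_0^1,\ \Delta_2^1 - \Delta_1^1,\ \Delta_3^1 - \Delta_2^1, \ldots$

$\Delta_0^3, \Delta_1^3, \Delta_2^3, \ldots$ for $\Delta_1^2 - \Delta_0^2,\ \Delta_2^2 - \Delta_1^2,\ \Delta_3^2 - \Delta_2^2, \ldots$

and so on.

$\Delta_0^1, \Delta_1^1, \ldots$ are called differences of the 1st order;

$\Delta_0^2, \Delta_1^2, \ldots$ differences of the 2nd order, and so on.

It is easy to show that

$$\Delta_0^2 = y_2 - 2y_1 + y_0, \qquad \Delta_1^2 = y_3 - 2y_2 + y_1,$$

$$\Delta_0^3 = y_3 - 3y_2 + 3y_1 - y_0, \qquad \Delta_1^3 = y_4 - 3y_3 + 3y_2 - y_1, \ \text{etc.},$$

and generally that

$$\Delta_0^r = y_r - r\,y_{r-1} + \frac{r(r-1)}{1\cdot 2}\,y_{r-2} - \ldots \ \text{to } r+1 \text{ terms} \qquad (\alpha)$$

and

$$\Delta_s^r = y_{r+s} - r\,y_{r+s-1} + \frac{r(r-1)}{1\cdot 2}\,y_{r+s-2} - \ldots \ \text{to } r+1 \text{ terms} \qquad (\beta)$$

the coefficients being those of the binomial expansion. Equations (α)
and (β) are easily proved by induction.

We can also express values of $y$ in terms of $y_0$ and the differences;

$$y_1 = y_0 + \Delta_0^1, \qquad y_2 = y_1 + \Delta_1^1 = y_0 + 2\Delta_0^1 + \Delta_0^2, \qquad y_3 = y_0 + 3\Delta_0^1 + 3\Delta_0^2 + \Delta_0^3, \ldots$$

and generally

$$y_r = y_0 + r\,\Delta_0^1 + \frac{r(r-1)}{1\cdot 2}\,\Delta_0^2 + \ldots \ \text{to } r+1 \text{ terms} \qquad (\gamma)$$

$$\Delta_r^s = \Delta_0^s + r\,\Delta_0^{s+1} + \frac{r(r-1)}{1\cdot 2}\,\Delta_0^{s+2} + \frac{r(r-1)(r-2)}{1\cdot 2\cdot 3}\,\Delta_0^{s+3} + \ldots \ \text{to } r+1 \text{ terms} \qquad (\delta)$$

$$y_{r+s} = y_s + r\,\Delta_s^1 + \frac{r(r-1)}{1\cdot 2}\,\Delta_s^2 + \frac{r(r-1)(r-2)}{1\cdot 2\cdot 3}\,\Delta_s^3 + \ldots \ \text{to } r+1 \text{ terms} \qquad (\epsilon)$$

which formulae can be proved by induction, and in fact (γ) is a special
case both of (δ) and (ε).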
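
The difference scheme and the binomial formulae can be checked on any short series. The following Python sketch is offered as an illustration only: the series is arbitrary, and the function names are not the author's. Here `table[r][s]` stands for the difference of order r with suffix s.

```python
from math import comb

def difference_table(ys):
    """table[r][s] is the difference of order r with suffix s; table[0] is ys."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

def difference_from_values(ys, r, s=0):
    """Equation (beta); with s = 0 it reduces to equation (alpha)."""
    return sum((-1) ** k * comb(r, k) * ys[r + s - k] for k in range(r + 1))

def value_from_differences(table, r, s=0):
    """Equation (epsilon); with s = 0 it reduces to equation (gamma)."""
    return sum(comb(r, k) * table[k][s] for k in range(r + 1))

ys = [3, 4, 9, 22, 47, 88]  # arbitrary illustrative values
table = difference_table(ys)
for order, row in enumerate(table):
    print(order, row)
```

For this series the third differences are constant and the fourth vanish, so every value can be rebuilt from $y_0$ and the leading differences by (γ).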

2. If we assume that $y$ can be expanded in ascending powers
of $x$, and is a rational function of the $n$th order, we have

$$y = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n, \qquad (\zeta)$$

where $a_0, a_1, \ldots a_n$ are constants.

Suppose now that $y_0, y_1, y_2, \ldots y_n$ are the values of $y$ which correspond
to values of $x$, increasing in arithmetic progression, viz.,
$x_0,\ x_0 + h,\ x_0 + 2h, \ldots x_0 + nh$; then, on this assumption, we have
$n + 1$ equations to determine the constants in equation (ζ).

We can, however, write equation (ζ) in terms of the $y$'s and $x$'s
without evaluating the constants; for

$$y = y_0 + \frac{x - x_0}{h}\,\Delta_0^1 + \frac{(x - x_0)(x - x_0 - h)}{1\cdot 2\cdot h^2}\,\Delta_0^2 + \ldots \ \text{to } n+1 \text{ terms} \qquad (\eta)$$

is an equation of the $n$th degree in $x$, and reduces to the identity (γ)
above, if we substitute $(x_0 + rh)$ and $y_r$, where $r$ is an integer not greater
than $n$, for $x$ and $y$.

Hence equation (η) is of the same order and satisfied by the same
$n + 1$ pairs of values as equation (ζ), and is therefore identical with it.

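Again, if $y_0, y_1, \ldots y_n$ are values of $y$ corresponding to any values
of $x$, viz., $x_0, x_1, \ldots x_n$, it can be shown that the equation

$$y = y_0\,\frac{(x - x_1)(x - x_2)\ldots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\ldots(x_0 - x_n)} + y_1\,\frac{(x - x_0)(x - x_2)\ldots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\ldots(x_1 - x_n)} + \ldots + y_n\,\frac{(x - x_0)(x - x_1)\ldots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\ldots(x_n - x_{n-1})}$$

is equivalent to (ζ), for it is of the same degree in $x$, and is satisfied by
the same $n + 1$ pairs of values of $(x, y)$, viz., $(x_0, y_0), (x_1, y_1), \ldots (x_n, y_n)$.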
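
For abscissae that are not in arithmetic progression, the equation just given can be evaluated term by term. A short Python sketch follows, with illustrative points taken from $y = x^3$; the function name is an assumption.

```python
def lagrange(points, x):
    """Value at x of the curve of lowest degree through the given (x, y) pairs."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Four unequally spaced points on y = x^3; interpolate at x = 3.
points = [(1, 1), (2, 8), (4, 64), (7, 343)]
print(lagrange(points, 3))
```

Four points determine a cubic, so the value at $x = 3$ comes out as 27 apart from rounding.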

3. Still assuming that an equation of the form of (ζ) satisfies
the conditions, we can at once interpolate any values needed.

Thus, if we are given that $y_0, y_1, y_2, \ldots y_{n-1}$ are values corresponding
to $x = 1, 2, 3 \ldots n$ respectively, we can find $y_s$, where $s$ is fractional,
by putting $x_0 = 1$, $h = 1$, $x = 1 + s$ in equation (η). We obtain

$$y_s = y_0 + s\,\Delta_0^1 + \frac{s(s-1)}{1\cdot 2}\,\Delta_0^2 + \frac{s(s-1)(s-2)}{1\cdot 2\cdot 3}\,\Delta_0^3 + \ldots \ \text{to } n \text{ terms} \qquad (\theta)$$

We can easily obtain similar formulae for any other intervals.

4. Notice that, if $y_0, y_1, \ldots y_n$ correspond to values of $x$
($x_0$, $x_0 + h$, &c.) in arithmetic progression,

$$y_0 = a_0 + a_1 x_0 + a_2 x_0^2 + \ldots + a_n x_0^n$$

and

$$y_1 = a_0 + a_1 (x_0 + h) + \ldots + a_n (x_0 + h)^n \quad \text{from } (\zeta);$$

$$\therefore\ \Delta_0^1 = a_1 h + \ldots + a_n\{(x_0 + h)^n - x_0^n\}$$

$$= a_1 h + \ldots + a_n \cdot n h\, x_0^{\,n-1} + \text{terms of lower degree in } x_0,$$

an equation of the $(n-1)$th degree only.

Continuing this process, we obtain

$$\Delta_0^n = a_n \cdot h^n \cdot n!,$$

and there are no higher differences.

Also, since $(x_0, y_0), (x_0 + h, y_1), \ldots (x_0 + \overline{n+1}\,h,\ y_{n+1})$ lie on a curve of
the $n$th degree, we have from equation (α)

$$y_{n+1} - (n+1)\,y_n + \frac{(n+1)\,n}{1\cdot 2}\,y_{n-1} - \frac{(n+1)\,n\,(n-1)}{1\cdot 2\cdot 3}\,y_{n-2} + \ldots \ \text{to } n+2 \text{ terms} = \Delta_0^{n+1} = 0.$$
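
Both results of this paragraph can be verified on a particular curve. The Python sketch below uses an arbitrarily chosen cubic and step; the names and numbers are illustrative assumptions, and integer arithmetic keeps the check exact.

```python
from math import factorial

def leading_difference(ys, order):
    """Leading difference of the given order, formed by repeated subtraction."""
    row = list(ys)
    for _ in range(order):
        row = [b - a for a, b in zip(row, row[1:])]
    return row[0]

coeffs = [2, -1, 0, 3]  # a_0 ... a_3, i.e. y = 2 - x + 3x^3 (arbitrary)
x0, h = 5, 2            # arbitrary starting abscissa and common difference
n = len(coeffs) - 1

ys = [sum(a * (x0 + k * h) ** p for p, a in enumerate(coeffs)) for k in range(n + 3)]

nth = leading_difference(ys, n)
next_order = leading_difference(ys, n + 1)
print(nth, coeffs[n] * h ** n * factorial(n), next_order)
```

The $n$th difference equals $a_n h^n\, n!$ whatever the starting abscissa, and the difference of order $n + 1$ is zero.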

5. If for any purpose we need to evaluate the constants in
equation (ζ), we can abbreviate the solution, as follows, if the $x$'s
are in arithmetic progression.

Given five pairs of values, we have

$$y_0 = a_0 + a_1 x_0 + a_2 x_0^2 + a_3 x_0^3 + a_4 x_0^4$$
$$y_1 = a_0 + a_1 (x_0 + h) + a_2 (x_0 + h)^2 + a_3 (x_0 + h)^3 + a_4 (x_0 + h)^4$$
$$y_2 = a_0 + a_1 (x_0 + 2h) + a_2 (x_0 + 2h)^2 + a_3 (x_0 + 2h)^3 + a_4 (x_0 + 2h)^4$$
$$y_3 = a_0 + a_1 (x_0 + 3h) + a_2 (x_0 + 3h)^2 + a_3 (x_0 + 3h)^3 + a_4 (x_0 + 3h)^4$$
$$y_4 = a_0 + a_1 (x_0 + 4h) + a_2 (x_0 + 4h)^2 + a_3 (x_0 + 4h)^3 + a_4 (x_0 + 4h)^4$$

As in the last paragraph $\Delta_0^4 = a_4 \cdot h^4 \cdot 4!$,

$$\therefore\ a_4 = \frac{\Delta_0^4}{24 h^4} \qquad (\lambda)$$

It is easily seen that $\Delta_0^3$ is independent of $a_0$, $a_1$ and $a_2$, and that

$$\Delta_0^3 = a_3\{(x_0 + 3h)^3 - 3(x_0 + 2h)^3 + 3(x_0 + h)^3 - x_0^3\} + a_4\{(x_0 + 3h)^4 - 3(x_0 + 2h)^4 + 3(x_0 + h)^4 - x_0^4\}$$
$$= 6h^3 a_3 + a_4\{24 h^3 x_0 + 36 h^4\},$$

whence

$$a_3 = \frac{\Delta_0^3}{6h^3} - a_4 (4x_0 + 6h);$$

while

$$\Delta_0^2 = 2h^2 a_2 + a_3 (6h^2 x_0 + 6h^3) + a_4 (12 h^2 x_0^2 + 24 h^3 x_0 + 14 h^4),$$

which gives $a_2$; $a_1$ and $a_0$ can then be found from the first two equations. The
points of inflexion on the curve, $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$, are
determined by the equation

$$0 = \frac{d^2 y}{dx^2} = 2a_2 + 6a_3 x + 12 a_4 x^2,$$

and the sign of $\dfrac{d^3 y}{dx^3}$, i.e. of $a_3 + 4a_4 x$, decides the nature of the change
of curvature.

This method is employed on page 254 infra.

6. In evaluating the constants it will be found that the
following identities are sometimes useful:

$$s^r - {}^sC_1\,(s-1)^r + {}^sC_2\,(s-2)^r - \ldots = \text{the coefficient of } x^r \text{ in the expansion of } r!\,\{e^{sx} - {}^sC_1\, e^{(s-1)x} + {}^sC_2\, e^{(s-2)x} - \ldots\},$$

i.e., the coefficient is

$0$, when $r = 1, 2, 3 \ldots s - 1$;

$r!$, when $r = s$;

$\dfrac{s}{2}\,(s+1)!$, when $r = s + 1$. $\qquad (\mu)$



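7. It is necessary to express the differential coefficients of
$y$ with regard to $x$ in equation (ζ) in terms of the differences,
and conversely. Now from a comparison of (ζ) and (η) we
have, when $x_0$ is zero,

$$a_0 + a_1 x + a_2 x^2 + \ldots = y_0 + \frac{x}{h}\,\Delta_0^1 + \frac{x(x - h)}{1\cdot 2\cdot h^2}\,\Delta_0^2 + \frac{x(x - h)(x - 2h)}{1\cdot 2\cdot 3\cdot h^3}\,\Delta_0^3 + \ldots$$

$$\therefore\ h\,a_1 = \Delta_0^1 - \tfrac{1}{2}\Delta_0^2 + \tfrac{1}{3}\Delta_0^3 - \tfrac{1}{4}\Delta_0^4 + \ldots$$

Writing $y = f(x)$ for equation (ζ), the equation just written gives,
when $x = x_0 = 0$,

$$h\,f'(x_0) = \Delta_0^1 - \tfrac{1}{2}\Delta_0^2 + \tfrac{1}{3}\Delta_0^3 - \tfrac{1}{4}\Delta_0^4 + \ldots$$

Applying the same process again and again, and remembering that
$\Delta_0^2$ bears the same relation to $\Delta_0^1$ as $\Delta_0^1$ bears to $y_0$, and so on, we
obtain, omitting the suffixes,

$$h^n f^{(n)}(x) = \left(\Delta - \tfrac{1}{2}\Delta^2 + \tfrac{1}{3}\Delta^3 - \ldots\right)^n, \qquad (\nu)$$

where the $\Delta$'s are to be treated as ordinary algebraic quantities, till the
exponent is removed.

Thus

$$h^2 f^{(2)}(x) = \Delta^2\left\{1 - \tfrac{1}{2}\Delta + \tfrac{1}{3}\Delta^2 - \ldots\right\}^2 = \Delta^2 - \Delta^3 + \tfrac{11}{12}\Delta^4 - \ldots$$

$$h^3 f^{(3)}(x) = \Delta^3 - \tfrac{3}{2}\Delta^4 + \ \text{etc.}$$

It is to be noticed in particular that each derived function
depends only on differences of as high an order as itself.

Again, by Taylor's Theorem,

$$y_1 = f(x_0 + h) = f(x_0) + h f'(x_0) + \frac{h^2}{2!} f^{(2)}(x_0) + \ldots$$

$$\therefore\ \Delta_0^1 = y_1 - y_0 = h f'(x_0) + \frac{h^2}{2!} f^{(2)}(x_0) + \ldots$$

$$y_2 = f(x_0 + 2h) = f(x_0) + 2h f'(x_0) + \frac{(2h)^2}{2!} f^{(2)}(x_0) + \ldots$$

and, using equations (α) and (μ), or otherwise, generally

$$\Delta_0^r = h^r\left\{f^{(r)}(x_0) + \frac{r}{2}\,h\,f^{(r+1)}(x_0) + \ldots\right\} \qquad (\omicron)$$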

8. We may now consider the assumptions made when we
took (ζ) to express the relation between $y$ and $x$.

If $y$ and $x$ are connected by any functional law, that is if $y$ is
determinate for all given values of $x$, without which assumption
interpolation is meaningless, then $y$ can be expressed as a function
of $x$; let $y = f(x)$, then, by Maclaurin's theorem,

$$y = f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f^{(2)}(0) + \frac{x^3}{3!} f^{(3)}(0) + \ldots$$

to an infinite number of terms.

If $f^{(n+1)}(0)$ and following coefficients are very small, and $x$ is
never large, the terms from the $(n+2)$th onwards become negligible
in comparison with earlier terms, so that the first $n + 1$ terms
determine the value of $y$ approximately. Now by the equations
(ν) and (ο), $f^{(n+1)}$ is small when $\Delta^{n+1}, \Delta^{n+2}, \ldots$ are small, and vice
versa. Hence we have the following general statement: any
functional relation between $y$ and $x$ reduces to the parabolic
equation of the $n$th degree (ζ), if the differences of orders higher
than the $n$th vanish, and if these differences do not vanish but are
small, equation (ζ) is still an approximate expression for the
relation.

Now if the line drawn through the given points is to have
continuous and slowly changing curvature, it is easily verified
that the second differences for points near together are not large,
for a rapid change in the rate of increase of the ordinate means
a rapid change of curvature; and if we construct a second curve
with the same abscissae and the first differences as ordinates,
small third differences will indicate absence of rapid change in
the first, and so on; but beyond this point it is not easy to
see the connection between the hypothesis underlying interpolation
and the diminution of successive differences. The
converse, however, is clearer; if in any series of figures it is
found experimentally that the successive differences tend to
disappear, then any curve which passes through the points is
expressed approximately by the parabolic equation. De Morgan
states this conclusion thus: "If we take n points near each
other, and having their abscissae in arithmetic progression, with a
small or at least not very large common difference, and their ordinates
not very unequal . . . the parabola of the (n-1)th order will
very nearly coincide with any regular curve of the same general
appearance, at least between the same points." Boole's explanation
is: "It is customary to assume for the general expression
of the values under consideration a rational and integral function
of x, and to determine the constants by the given conditions.
This assumption rests upon the supposition (a supposition, however,
actually verified in the case of all tabulated functions*) that
the successive orders of differences rapidly diminish."

Since, from equation (ο), when h is small, the successive
differences for any curve diminish as their order becomes higher,
it is a legitimate process to build up a series of values of any
function on the hypothesis that the higher differences vanish.
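
The dependence on the size of the common difference can be seen by tabulating one smooth function at two spacings. The Python sketch below uses the exponential function purely as a convenient example; the choice of function, spacings, and names are assumptions made for illustration.

```python
import math

def leading_differences(f, x0, h, count):
    """Leading differences of orders 1 .. count-1 of f tabulated from x0 at spacing h."""
    row = [f(x0 + k * h) for k in range(count)]
    leading = []
    while row:
        leading.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return leading[1:]  # drop y_0, keep the differences

small_step = leading_differences(math.exp, 0.0, 0.1, 5)
large_step = leading_differences(math.exp, 0.0, 1.0, 5)
print("h = 0.1:", [f"{d:.6f}" for d in small_step])
print("h = 1.0:", [f"{d:.3f}" for d in large_step])
```

At the small spacing each difference is roughly a tenth of the one before it; at the large spacing the differences grow with their order, so the hypothesis of vanishing higher differences would not be admissible there.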

If a freehand curve is drawn so as to pass through the chosen 
fixed points, and to have curvature which changes as slowly as 
possible, a line will be obtained which lies very near that given 
by equation (f). Such a line would be similar to the track of a 
bicyclist who was riding so as to pass over several marks, or to 
just avoid several obstacles. 

9. It is clear from the above analysis that we can make a
smooth continuous curve pass through any number of points we
please; for with the parabolic equation (ζ) there are never any
sudden jumps in the values of $y$, $\dfrac{dy}{dx}$ or $\dfrac{d^2y}{dx^2}$, as $x$ changes continuously;
and we can obtain as many linear equations (which
have always real values) as there are constants, simply by taking
$n$ in the original equation to be the number of fixed points.

* [...] not statistical approximations.

If we have, let us say, 10 points, as

[Diagram: ten points, among them D, E, F, G, H and K, with a fixed vertical line between F and G.]
and wish to find a point on a fixed vertical line between F and G, we 
can either take only F and G into consideration, and, joining them 
by a straight line, obtain the point $x_1$; or considering E, F, and G,
or F, G, and H, draw parabolas and obtain $x_2$ or $x_3$; or considering
E, F, G, and H, draw a parabola of the third order, which would 
have a point of inflexion near F ; this would be approximately the 
path a bicyclist might follow if he had to start from E, and ride 
to a near point H, passing close to F and G. If we now include 
D and K (if our bicyclist has to start from D, pass E, F, G, and H, 
and reach K) we shall modify the curvature throughout ; and as 
we include more and more points shall continue to affect slightly 
the path F G. If the inclusion of the nearer points tends to 
make the line F G approximate more and more closely to a 
final position, while the further inclusion of the more distant
points throws it further away, we may conclude that the positions 
of these further points are not governed by the same numerical 
conditions as the nearer one. Thus in a " table of survivals " 
the figures for ages under 5 years are not distributed in accord- 
ance with the curve determined by the figures for higher ages ; 
in a table showing wages, it may be seen that those of highly 
paid workmen are not governed by the same causes as those 
lower in the scale. On the other hand, the number in each 
census is dependent on all the previous numbers for more than
one generation. In interpolating for the population of 1876 we
shall obtain different figures according as we include 1851, '61,
'71, '81, '91 only, or 1901 as well; and this is not surprising, for
a mistake made in 1876 may not come to light till we have 
watched the growth of the population for twenty-five years. It 
is clear that the points far from the period in which the interpolation
is to be done cannot be allowed so much influence as
those nearer, and it appears experimentally that this condition
is fulfilled in the method discussed; also, in series (η) the successive
coefficients begin to diminish with the \(r\)th term where
\(x < x_0 + (2r - 3)h\), that is, with the coefficient of the first difference
when \(x\) is between \(x_0\) and \(x_0 + h\). It may be noticed that
the wanderings of the curve are limited by the condition that a
curve of the \((n-1)\)th order cannot have more than \(n - 3\) points of
inflexion, for \(\frac{d^2y}{dx^2}\) has no term of a higher degree than \(x^{n-3}\).

In the above illustration the intermediate points from F to
G might be found from the five points D, E, F, G, H, or from 
E, F, G, H, K. These two curves may be welded together be- 
tween F and G. The points near F are more accurately deter- 
mined by the first, of which it is the middle ; those near G by the 
second. The welding line should touch the first at F, the second 
at G. This is conveniently done by the use of the sine curve. 
This method is employed, I believe, at the Registrar-General's 
office. 

It cannot be said that the present theory of statistical inter- 
polation rests on an altogether satisfactory basis.* The prin- 
ciples which govern it are not well defined, and the mathematical 
analysis of the methods, by which the principles should be 
brought into relation with the facts, is incomplete. Yet it is 
perhaps unnecessary to labour after more refined methods, for 
interpolation cannot be precise unless we actually know the 
algebraic expression of the laws which govern the figures, and 
the method here discussed is found to satisfy the conditions 
empirically, while further refinements could only introduce slight 
modifications.

* This remark does not apply to the interpolation in evaluating mathematical
functions.

10. Examples showing the Numerical Use of the
Formulae. — (1.) Given the number of wage-earners earning
sums in 5s. groups, to estimate the number earning as much as
24s. and not so much as 25s.



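*Numbers per 1,000 Wage-Earners (Adult males).

Earning as much as
10s. and not so                         Differences.
much as            Numbers.    1st.    2nd.    3rd.    4th.

15s.                   39      257      46    -144     151
20s.                  296      303     -98       7      18
25s.                  599      205     -91      25
30s.                  804      114     -66
35s.                  918       48
40s.                  966

Each difference is entered on the line of the earlier of the two
figures from which it is formed.
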

Neglect the increasing differences arising from the number earning 
less than 15 s. 

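Using formula (η), \(x_0 = 20\) (shillings), \(h = 5\), \(y_0 = 296\), \(\Delta^1 = 303\),
\(\Delta^2 = -98\), \(\Delta^3 = 7\), \(\Delta^4 = 18\).

At 25s., \(y = 599\), from above table.

At 24s., \(x = 24\),

\[ y = 296 + \frac{4}{5} \text{ of } 303 + \frac{4 \times (-1)}{5 \times 10} \text{ of } (-98) + \frac{4 \times (-1) \times (-6)}{5 \times 10 \times 15} \text{ of } 7 + \frac{4 \times (-1) \times (-6) \times (-11)}{5 \times 10 \times 15 \times 20} \text{ of } 18 \]

\[ = 296 + 242.4 + 7.84 + .224 - .3168 = 546 \text{ (nearly)}. \]

The required number is therefore 599 - 546 = 53.

Again at 23s., \(x = x_0 + 3\), \(y = 489\), and the number earning as much
as 23s. and not so much as 24s. is 57.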
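
The arithmetic of this example can be checked mechanically. The following is a minimal sketch, assuming the usual forward-difference form of the interpolation formula; the function names `forward_differences` and `newton_forward` are illustrative only and do not come from the text.

```python
def forward_differences(values):
    """Return the leading differences [y0, D1, D2, ...] of equally spaced values."""
    leading = []
    row = list(values)
    while row:
        leading.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return leading


def newton_forward(x, x0, h, values):
    """Evaluate the forward-difference interpolation polynomial at x."""
    u = (x - x0) / h
    total, coeff = 0.0, 1.0
    for k, delta in enumerate(forward_differences(values)):
        total += coeff * delta
        coeff *= (u - k) / (k + 1)  # builds u(u-1)...(u-k)/(k+1)!
    return total


# Cumulative numbers per 1,000 at 20s., 25s., 30s., 35s., 40s.
wages = [296, 599, 804, 918, 966]
y24 = newton_forward(24, 20, 5, wages)
y23 = newton_forward(23, 20, 5, wages)
print(round(y24, 2), round(y23, 2))
```

The differences arising from the 15s. figure are left out, as the example directs, so only the five values from 20s. upwards enter the calculation.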

(2.) To make an estimate for the value of imports in the year
1813, the records for which were destroyed by fire.

Given value of imports in —

1810      £39,202,000      \(y_1\)
1811       26,510,000      \(y_2\)
1812       26,163,000      \(y_3\)
1813          ...          \(y_4\)
1814       33,755,000      \(y_5\)
1815       32,987,000      \(y_6\)
1816       27,431,000      \(y_7\)



* General Report on Wages.

From formulae (κ), using \(y_3\) and \(y_5\) only —

\(y_5 - 2y_4 + y_3 = 0\), \(y_4\) = 29,959.

From formulae (κ), using \(y_2\) and \(y_6\) as well —

\(y_6 + y_2 - 4(y_3 + y_5) + 6y_4 = 0\), \(y_4\) = 30,029.

From formulae (κ), using \(y_1\) and \(y_7\) as well —

\(y_7 + y_1 - 6(y_2 + y_6) + 15(y_3 + y_5) - 20y_4 = 0\), \(y_4\) = 30,421.

Here the first and second values (in thousands of pounds) are very
near together, while the third differs; hence we adopt £30,000,000 as
the value required. (Cf. a similar example in Boole's chapter.)
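
The three estimates can be reproduced by setting the difference of order 2, 4, or 6 across the gap equal to zero. This is a sketch under that assumption; the helper name `fill_missing_middle` is hypothetical, and the values are taken in thousands of pounds.

```python
from math import comb


def fill_missing_middle(before, after):
    """Estimate a missing middle term from m known terms on each side by
    setting the difference of order 2m over the 2m + 1 terms to zero."""
    m = len(before)
    if len(after) != m:
        raise ValueError("need the same number of terms on each side")
    order = 2 * m
    known = list(before) + [None] + list(after)
    total = 0
    for i, y in enumerate(known):
        if y is not None:
            total += (-1) ** i * comb(order, i) * y
    # The missing term carries the coefficient (-1)^m * C(2m, m).
    return -total / ((-1) ** m * comb(order, m))


# Imports in thousands of pounds, 1810-1816, with 1813 missing.
before = [39202, 26510, 26163]
after = [33755, 32987, 27431]
estimates = [fill_missing_middle(before[-m:], after[:m]) for m in (1, 2, 3)]
print([round(e) for e in estimates])
```

Comparing the estimates as more distant years are brought in is the same test of stability that the example applies when it rejects the third value.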

(3.) In Mr Booth's Life and Labour of the People, e.g., Vol.
V., p. 46, a series of very useful diagrams is given showing the
age distribution of various classes. The figures he uses are as
follows : —

                 Proportion occupied      Average at each year
                 per 10,000 of total      of age between
Ages.            aged 10-80.              given limits.

10-15 years           193.5                    38.7
15-20   „             880                     176
20-25   „             933                     186.6
25-35   „           1,636                     163.6
35-45   „           1,201                     120.1
45-55   „             830                      83
55-65   „             434                      43.4
65-80   „             192.5                    12.8

His diagram is drawn from the last column, the numbers in 
which form the ordinates for the middle of the corresponding 
age periods. The points so obtained are joined by straight lines. 
This method is sufficiently accurate for his purpose, but it will 
afford an interesting example of interpolation if we obtain some 
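of the figures for intermediate years more closely.

                    Numbers up to
Mean Age.           corresponding limits.

\(x_1 = 12\tfrac12\)          \(y_1 = 193.5\)
\(x_2 = 17\tfrac12\)          \(y_2 = 1073.5\)
\(x_3 = 22\tfrac12\)          \(y_3 = 2006.5\)
\(x_4 = 30\)                  \(y_4 = 3642.5\)
\(x_5 = 40\)                  \(y_5 = 4843.5\)
\(x_6 = 50\)                  \(y_6 = 5673.5\)
\(x_7 = 60\)                  \(y_7 = 6107.5\)
\(x_8 = 72\tfrac12\)          \(y_8 = 6300\)

Since the \(x\)'s are not in arithmetic progression, we must
use Lagrange's formula (θ).

To find the ordinate corresponding to the age 35, for example,
we will include the five values of \(y\) from \(y_2\) to \(y_6\).

Then
\[ y = 1073.5 \times \frac{(12\tfrac12)(5)(-5)(-15)}{(-5)(-12\tfrac12)(-22\tfrac12)(-32\tfrac12)} + 2006.5 \times \frac{(17\tfrac12)(5)(-5)(-15)}{(5)(-7\tfrac12)(-17\tfrac12)(-27\tfrac12)} \]
\[ + 3642.5 \times \frac{(17\tfrac12)(12\tfrac12)(-5)(-15)}{(12\tfrac12)(7\tfrac12)(-10)(-20)} + 4843.5 \times \frac{(17\tfrac12)(12\tfrac12)(5)(-15)}{(22\tfrac12)(17\tfrac12)(10)(-10)} \]
\[ + 5673.5 \times \frac{(17\tfrac12)(12\tfrac12)(5)(-5)}{(32\tfrac12)(27\tfrac12)(20)(10)} \]

= 4412.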

Mr Booth's method gives 4243 for the same position. 
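
As a check on the figure 4412, here is a short sketch of Lagrange's formula for unequally spaced abscissae. The function name `lagrange` is an illustrative choice; the data are the five pairs \(x_2, y_2\) to \(x_6, y_6\) tabulated above.

```python
def lagrange(x, xs, ys):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


ages = [17.5, 22.5, 30.0, 40.0, 50.0]
cumulative = [1073.5, 2006.5, 3642.5, 4843.5, 5673.5]
y35 = lagrange(35.0, ages, cumulative)
print(round(y35, 1))
```

Each weight is the product of the numerator factors \((35 - x_j)\) over the denominator factors \((x_i - x_j)\), which is exactly how the five terms in the worked sum are built up.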

(4.) We can now determine the median and the mode more
accurately than before. We will use the figures already employed
in Chapter IV., which may be retabulated thus : —

Earning                           Differences.                Corresponding
more than   Numbers.    1st.     2nd.      3rd.     4th.      Abscissae.

$4.75            9                                                19
 4.25           13                                                17
 3.75          109                                                15
 3.25          363                                                13
 2.75          561       506      464      -137     -15           11
 2.25        1,067       970      327      -152                    9
 1.75        2,037     1,297      175    -1,330                    7
 1.25        3,334     1,472   -1,155                              5
  .75        4,806       317                                       3
  .25        5,123                                                 1

Unit of abscissa, $.25.

To find the median use the five points whose abscissae are 11,
9, 7, 5, 3.
Equation (f) gives —

\[ 561 = a_0 + a_1 \cdot 11 + a_2 \cdot 11^2 + a_3 \cdot 11^3 + a_4 \cdot 11^4 \]
\[ 1067 = a_0 + a_1 \cdot 9 + a_2 \cdot 9^2 + a_3 \cdot 9^3 + a_4 \cdot 9^4 \]
\[ 2037 = a_0 + a_1 \cdot 7 + a_2 \cdot 7^2 + a_3 \cdot 7^3 + a_4 \cdot 7^4 \]
\[ 3334 = a_0 + a_1 \cdot 5 + a_2 \cdot 5^2 + a_3 \cdot 5^3 + a_4 \cdot 5^4 \]
\[ 4806 = a_0 + a_1 \cdot 3 + a_2 \cdot 3^2 + a_3 \cdot 3^3 + a_4 \cdot 3^4 \]

Using equations (λ), we have —

\[ a_4 = \frac{-15}{24 \times 2^4} = -\frac{5}{128}, \text{ since } h = -2,\ \Delta^4 = -15; \]

\[ a_3 = \frac{137}{6 \times 8} + 15\left(\frac{11}{6 \times 16} - \frac{1}{4 \times 8}\right) = \frac{197}{48}, \text{ since } x_0 = 11 \text{ and } \Delta^3 = -137; \]

\[ 464 = 8a_2 + \frac{197}{48}(24 \times 11 - 6 \times 8) - \frac{5}{128}(48 \times 11^2 - 24 \times 8 \times 11 + 14 \times 16). \]

Equations (μ) could have been used with advantage if the difference
between successive abscissae had been unity.
\(a_1\) is found from the equation —

\[ -1472 = 2a_1 + 16a_2 + 98a_3 + 544a_4, \]

\[ a_1 = -657\tfrac{5}{48}, \]

and finally \(a_0 = 6972\tfrac{91}{128}\).

The median is then found from the equation —

\[ 2561\tfrac12 = a_0 + a_1x + a_2x^2 + a_3x^3 + a_4x^4. \]
Solving by Horner's method, we find \(x = 6.142\); and, therefore, the
median is at $1.536.
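
The root found by the first method can be confirmed without forming the coefficients, since the fourth-order curve through the five points is unique. The sketch below evaluates that curve by Lagrange's formula and locates the median by bisection; bisection is a stand-in for Horner's method, and the names `lagrange` and `bisect_root` are illustrative.

```python
def lagrange(x, xs, ys):
    """Evaluate the interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def bisect_root(f, lo, hi, tol=1e-9):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


abscissae = [11, 9, 7, 5, 3]
numbers = [561, 1067, 2037, 3334, 4806]
half_total = 5123 / 2  # the median divides the 5,123 earners equally

x_median = bisect_root(
    lambda x: lagrange(x, abscissae, numbers) - half_total, 5, 7
)
median_dollars = 0.25 * x_median  # unit of abscissa is $.25
print(round(x_median, 3), round(median_dollars, 3))
```

The search interval 5 to 7 is chosen because the tabulated numbers at those abscissae, 3,334 and 2,037, lie on either side of 2561½.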

Second method : —

Suppose \(x\) expressed as a function of \(y\), and apply Lagrange's
formula (θ) suitably altered.

\[ x = 3\,\frac{(2561\tfrac12 - 3334)(2561\tfrac12 - 2037)(2561\tfrac12 - 1067)(2561\tfrac12 - 561)}{(4806 - 3334)(4806 - 2037)(4806 - 1067)(4806 - 561)} + 5\,\frac{(2561\tfrac12 - 4806)(\ \ )(\ \ )(\ \ )}{(3334 - 4806)(\ \ )(\ \ )(\ \ )} + \ldots, \]

whence \(x = 6.237\); that is the median is at $1.56.

This method saves the solution of a biquadratic, and with
small numbers would need less numerical work than the first
method.*

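Third method : —

Use formula (η) to obtain the necessary equation.

Thus
\[ 2561\tfrac12 = y = 561 + \frac{x - 11}{-2} \text{ of } 506 + \frac{(x-11)(x-9)}{-2 \times -4} \text{ of } 464 + \frac{(x-11)(x-9)(x-7)}{-2 \times -4 \times -6} \text{ of } (-137) \]
\[ + \frac{(x-11)(x-9)(x-7)(x-5)}{-2 \times -4 \times -6 \times -8} \text{ of } (-15). \]

This reduces to the same equation as in the first method —

\[ 2561\tfrac12 = 6972\tfrac{91}{128} - 657\tfrac{5}{48}x - 33\tfrac{43}{64}x^2 + \tfrac{197}{48}x^3 - \tfrac{5}{128}x^4. \]

The quartiles, deciles, and percentiles can be found by similar
methods.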

* Compare Edgeworth's Representation of Statistics by Mathematical
Formulae, Statistical Journal, 1898, p. 699.

For the mode we must take in the last number, 5,123, and
recalculate \(a_2\), \(a_3\), \(a_4\) for the five highest values of \(y\), and then
solve the quadratic given in paragraph 5, viz. —

\[ a_2 + 3a_3x + 6a_4x^2 = 0, \]

giving the constants their new values.

Hence \(x = 8.2\) or \(4.40\); \(\frac{d^3y}{dx^3}\) is positive, and therefore \(-\frac{dy}{dx}\) a maximum,
when \(x = 4.40\). The mode is then at $1.10.

The mode can of course be determined less accurately by 
taking 4 or 3 given points instead of 5, or for greater accuracy 
more can sometimes be used. 

Another mode may be found between $2.75 and $3.75 from 
the five highest abscissae. This proves to be at $3.20. 

This method is applicable to such problems as the determina- 
tion of the date at which the population, the marriage, birth, and 
death rates, &c., increased most rapidly ; at what age the chance 
of death increases most, &c.* 

B. The second division of the problem of interpolation is 
when the original returns have to be corrected, e.g., the determination
of the distribution by age from the census returns.

We have now the problem of drawing a smooth line in the 
neighbourhood of a great number of points, but not necessarily 
through any of them. The assumption is that the returns are 
insufficient in number or deficient in accuracy, and that they 
indicate a regular distribution which it is required to represent.

1. One method is to assume that the averages over fairly 
large groups are accurate, and to these averages to apply any of 
the methods discussed under group A. 

2. A second method has been used in the section in which 
various curves were smoothed (vide supra, Chapter VII.). This
may be restated as follows : — Take successive groups of 2, or 3, 
or 4 .... 10 points, beginning again and again at the ordinates 
for each of the given abscissae. Find the centres of gravity of 
each group ; that is, erect an ordinate equal to the average 
of the ordinates of a group at the point half-way between the 
ends of the abscissae of the outside ordinates of the group. 
Draw a line through the points so obtained. It will be found 
that this line satisfies all the conditions laid down. An 
example of this method is given in the diagram facing p. 151. 
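
The grouping rule just stated amounts to a running average. A minimal sketch follows, assuming equal weights within each group; the function name `centres_of_gravity` and the sample figures are hypothetical and serve only to show the mechanics.

```python
def centres_of_gravity(xs, ys, group):
    """For each run of `group` consecutive points, return the point whose
    abscissa is half-way between the outside abscissae of the run and whose
    ordinate is the average of the run's ordinates."""
    points = []
    for start in range(len(xs) - group + 1):
        x_mid = (xs[start] + xs[start + group - 1]) / 2
        y_avg = sum(ys[start:start + group]) / group
        points.append((x_mid, y_avg))
    return points


# Hypothetical observations, not taken from the text.
xs = [0, 1, 2, 3, 4]
ys = [1, 4, 2, 6, 3]
smoothed = centres_of_gravity(xs, ys, group=3)
print(smoothed)
```

A larger group gives a smoother line at the cost of losing more points at each end of the series.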

* Cf. Edgeworth, in Statistical Journal, 1899, p. 381, and the references
there given.

3. In another method* the original figures are smoothed till
the differences of the fourth or fifth or higher orders vanish; and
then the ordinary formulae of interpolation are applied.

Thus in example 1, on page 250, rewrite the table thus : —

                Smoothed               Corrected Differences.
Wages.          Numbers.         1st.            2nd.              3rd.

Up to 20s.      296              303 + a         -98 - a + b       7 - 3b
  „   25s.      599 + a          205 + b         -91 - a - 2b      25 + 2a + 3b
  „   30s.      804 + a + b      114 - a - b     -66 + a + b
  „   35s.      918              48
  „   40s.      966

If we put \(b = 2\tfrac13\), \(a = -16\), the third differences vanish, and we have
\(\Delta^1 = 287\), \(\Delta^2 = -79\tfrac23\), \(\Delta^3 = \Delta^4 = 0\); when \(x = 25\), \(y = 583\), and when

\(x = 24\), \(y = 296 + \tfrac45\) of \(287 - \tfrac{2}{25}\) of \((-79\tfrac23) = 532\);
so that the number earning as much as 24s. and not so much as
25s. is now found to be 51, instead of the number found in example 1.
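
The corrections \(a\) and \(b\) and the interpolated figure can be verified with exact fractions. This is a sketch only: the two conditions \(7 - 3b = 0\) and \(25 + 2a + 3b = 0\) are read off from the third-difference column, and the helper name `differences` is illustrative.

```python
from fractions import Fraction


def differences(values, order):
    """Return the differences of the given order of a list of values."""
    row = list(values)
    for _ in range(order):
        row = [q - p for p, q in zip(row, row[1:])]
    return row


b = Fraction(7, 3)                # from 7 - 3b = 0
a = (Fraction(-25) - 3 * b) / 2   # from 25 + 2a + 3b = 0

table = [296, 599 + a, 804 + a + b, 918, 966]
d1 = differences(table, 1)[0]     # leading first difference
d2 = differences(table, 2)[0]     # leading second difference

u = Fraction(24 - 20, 5)          # position of 24s. in units of h = 5
y24 = table[0] + u * d1 + u * (u - 1) / 2 * d2
print(a, b, d1, d2, float(y24))
```

With the third differences removed, two terms of the interpolation formula beyond the starting value are enough, which is what shortens the numerical work.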

The corrections may be applied to any of the original figures. 

We need to solve only one more equation to complete our table 
from 20s. to 30s. 

When \(x = 23\), \(y = 296 + \tfrac35\) of \(287 + \tfrac{3}{25}\) of \(79\tfrac23\). The difference between
this and the value of \(y\), when \(x = 24\), is \(\tfrac15\) of \(287 - \tfrac{1}{25}\) of \(79\tfrac23 = 54.2\).

We have therefore the following table, where the figures for
20s., 24s., and 25s. have already been calculated, while the others
are added on the assumption that the third differences are zero.

Wages.          Numbers.      1st.      2nd.      3rd.

Up to 20s.        296         63.8      3.2        0
  „   21s.        360         60.6      3.2        0
  „   22s.        420         57.4      3.2        0
  „   23s.        478         54.2      3.2        0
  „   24s.        532         51        3.2        0
  „   25s.        583         47.8      3.2        0
  „   26s.        631         44.6      3.2        0
  „   27s.        675         41.4      3.2        0
  „   28s.        717         38.2      3.2
  „   29s.        755         35
  „   30s.        790

Each first difference is the amount by which the number on its own
line exceeds the number on the line below; the first differences
diminish by 3.2 at each step.

If we had taken the second differences more exactly, we
should have obtained \(804 + a + b = 790\tfrac13\) for the last figure
as in the previous table.

* Suggested to me by Mr W. F. Sheppard.

This method of writing down many figures when the signifi- 
cant differences have been found can be very generally applied 
in Group A as well as here. 

4. Another method, involving higher mathematics, would be 
discussed more suitably after the section devoted to the law of 
error ; a brief explanation with a useful formula may, however, 
be offered here. 

Suppose we have five consecutive points \((-2, y_{-2})\), \((-1, y_{-1})\),
\((0, y_0)\), \((1, y_1)\), \((2, y_2)\).

A parabola of the fourth order could be drawn through these
five points, but would have two points of inflexion. A great
number of parabolas of the third order can be drawn near all
the points, having no points of inflexion, and satisfying all the
ordinary conditions of interpolation.

Borrowing a principle from the method of least squares,* if
the coefficients of the parabola \(y = a + bx + cx^2 + dx^3\) are chosen
so as to make the quantity

\[ \sum (a + bx + cx^2 + dx^3 - y)^2 \]

(where the summation extends over the five pairs of values of
\(x\) and \(y\)) a minimum, the parabola so determined will be the
best for the purpose.

For the necessary mathematical analysis, Professor Darwin's
paper On Fallible Measures,† from which this method is taken,
should be consulted.

The following equation is obtained —
\(a = y_0 - \tfrac{3}{35} \times \Delta^4\), where \(\Delta^4\) is the difference of the fourth order for
the \(y\)'s.

Now replace the point \((0, y_0)\) by the intersection of its
ordinate with the parabola, that is by \((0, a)\), where \(a\) has the
value just given, that is, diminish \(y_0\) by the quantity \(\tfrac{3}{35}\Delta^4\).

Repeat the same process for each point on the original line,
taking it as the middle of a group of 5, and a smooth curve
lying very near all the original points is obtained.



* See Merriman's Method of Least Squares, Chap. III.
† See Phil. Mag. and Journal, July 1877.

Thus we may smooth line C in diagram facing p. 164.

Imported Wheat per head of the Population.

                       Differences.
Year    lbs.    1st.    2nd.    3rd.    4th.    Smoothed Figures.

1890    226
                 18
1891    244             -17
                  1              19
1892    245               2             -16     245 + 3/35 of 16  = 246⅓
                  3               3
1893    248               5              13     248 - 3/35 of 13  = 247
                  8              16
1894    256              21             -94     256 + 3/35 of 94  = 264
                 29             -78
1895    285             -57             134     285 - 3/35 of 134 = 273½
                -28              56
1896    257              -1             -16     257 + 3/35 of 16  = 258⅓
                -29              40
1897    228              39
                 10
1898    238


5. In many series of observations it is found that the num- 
bers very nearly satisfy some algebraic formulae,* such as the 
binomial expansion, the geometric progression, the law of error, 
or some specially chosen expression. In such a case the con- 
stants of the equation chosen are computed by methods similar 
to that of the last paragraph, and the original observations are 
replaced by the ordinates of the curve thus determined. Prof. 
Pareto has found an equation which fits the data of the distribution
of incomes.† Modern mathematical statistics deals very
frequently in such formulae. Here we will briefly describe one
which has very practical utility, namely, Makeham's formula
for the life table.‡ If \(l_x\) is the number who survive to the age \(x\)
out of a given generation, then the formula \(l_x = k s^x g^{c^x}\), where
\(k\), \(s\), \(g\), \(c\) are selected constants, fits the records from the ages of
20 upwards with such exactness, that the formula is used for
practical actuarial calculations. The formula is not quite arbitrary,
but can be obtained from the hypotheses described in the
following paragraph.

Let the quantity \(\frac{l_x - l_{x+dx}}{l_x} = -\frac{dl_x}{l_x} = \mu_x\,dx\). Then \(\mu_x\) is called the
"force of mortality," and represents the ratio of the number of persons
dying in a short interval to the total number alive at the beginning
of the interval.

* See Edgeworth, ibid., p. 671.
† See Statistical Journal, 1896, p. 533.
‡ See Institute of Actuaries Text-Book, Part II.

This force is supposed to consist of two parts, one constant and = A;
and the other such that the ratio of the increase of the force to the force
is constant, that is, that the force continually increases in a geometric
progression. For the latter part (\(\mu'_x\))

\[ \log \mu'_x = Dx + E \]

\[ \mu'_x = e^{Dx+E} = Bc^x, \text{ where } B = e^E \text{ and } \log c = D. \]

Then \(\mu_x = A + Bc^x\).

This equation represents the hypothesis that the chance of death
consists of one part which is constant for all ages, and another which
is due to the power of resisting death diminishing continuously with
age in a constant ratio.

We have \(-\frac{dl_x}{l_x} = \mu_x\,dx = (A + Bc^x)\,dx\)

\(-\log l_x = Ax + k_1c^x + k_2\), where \(k_1\), \(k_2\) are constants.

\(l_x = k s^x g^{c^x}\), where \(-A = \log s\),

\(-k_1 = \log g\),

\(-k_2 = \log k\).
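
The connection between the formula for \(l_x\) and the force of mortality can be checked numerically. The constants below are hypothetical, chosen only to give plausible magnitudes; natural logarithms are assumed, so that \(A = -\log s\) and \(B = -\log g \cdot \log c\) (the latter following from \(k_1 = B/\log c\) on integrating \(Bc^x\)).

```python
import math


def makeham_lx(x, k, s, g, c):
    """Survivors to age x under the formula l_x = k * s**x * g**(c**x)."""
    return k * s ** x * g ** (c ** x)


def force_of_mortality(x, k, s, g, c, dx=1e-5):
    """Estimate -(1/l_x) * dl_x/dx by a central difference."""
    l_plus = makeham_lx(x + dx, k, s, g, c)
    l_minus = makeham_lx(x - dx, k, s, g, c)
    return -(l_plus - l_minus) / (2 * dx * makeham_lx(x, k, s, g, c))


# Hypothetical constants, not taken from any published life table.
k, s, g, c = 1000.0, 0.994, 0.9995, 1.09
A = -math.log(s)
B = -math.log(g) * math.log(c)

for age in (20, 40, 60):
    numeric = force_of_mortality(age, k, s, g, c)
    closed_form = A + B * c ** age
    print(age, round(numeric, 6), round(closed_form, 6))
```

The two columns agree, which illustrates that the constant part A comes from \(s\) alone while the geometrically increasing part comes from \(g\) and \(c\).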

For further information on the subject of interpolation, the reader is referred
to Dr Farr's Life Table (No. 3), 1864, Boole's Finite Differences, Text-Book of
Institute of Actuaries, Part II., p. 420 seq., Rice's Theory and Practice of
Interpolation, 1899, Merrifield On Quadratures and Interpolation (British
Association Report, 1880), Chauvenet's Spherical and Practical Astronomy (Chap. II.),
Woodhouse in the Assurance Magazine (Vols. XI., XII.), Professor J. D.
Everett's Papers (published or forthcoming) On the Algebra of Difference
Tables (Quarterly Journal of Mathematics, No. 124, 1900), On a
Central-difference Interpolation Formula (British Association Report, 1900), and in the
Journal of the Institute of Actuaries, January 1901, and Mr W. F. Sheppard's
Papers On Central Difference Formulae (Proceedings of the London
Mathematical Society, Vol. XXXI., Nos. 707-710), and On the Use of Auxiliary
Curves in Statistics of Continuous Variation (Statistical Journal, September
1900). In these other references will be found. Part of the foregoing
chapter might be simplified by the use of "central differences," but in so
short an introduction to the subject it seemed best to keep to the more
familiar method.

PART II.

APPLICATION OF THE THEORY OF PROBABILITY
TO STATISTICS.



SECTION I.

Introductory.

Object of Part II.

The arguments on which the theory of algebraic probability
depends are not difficult to follow, and are in fact grounded
on every-day experience; the development of
calculations also is often little more than straightforward
arithmetic; and without using any elaborate mathematical
theories we can examine the nature and deduce the
equation of the curve of error, which, though it is the foundation
of modern mathematical statistics, is only a reasonable summary
of common experience.

It is not proposed here to go beyond the more elementary 
and common applications of the law of error ; the more 
advanced treatment tends to deal more with theory and less 
with practical applications, and is most suitably studied in 
the original treatises scattered through the journals of various 
learned societies. The present object is to endeavour to make 
clear the groundwork of the subject, so that it will be the 
easier for students to follow modern writers on statistics ; the 
mathematicians who are opening up new ground in this direc- 
tion naturally cannot stop in each article they write to establish 
the elementary theorems which are already common property, 
and so it is often not easy for readers, unfamiliar with these 
elements, to find any satisfactory discussion or proof of the 
preliminary formulae or theorems, since they are not contained
in any text-book devoted to the subject. It is this lack of
a preliminary text-book that it is wished to supply in this
and the following sections.

The treatment is not intended to be original, and is, it is 
hoped, not inconsistent with Professor Edgeworth's published 
treatises,* since the greater part of the mathematics employed 
is gleaned from his essays, and the earlier authorities to whom 
he makes reference. The exact form of the proofs employed, 
and the particular ways in which the formulae are used, are not 
in all cases to be found elsewhere ; and any fault which may be 
found with the arguments or application of formulae must attach 
to the present writer. To avoid mere repetition of what is better 
said elsewhere, and not to cumber the ground with well-known 
elementary formulae, the reader is assumed to be acquainted 
with Dr Venn's Logic of Chance, and Whitworth's Choice and
Chance, or with the chapter on Probability to be found in
ordinary school algebras. It is hoped that the following pages, 
however, will be for the most part independent of proofs or 
formulae of which the explanation is not furnished. 

Need for application of probability arises from the definition of statistics.

To the statistician of a generation ago, to the so-called
practical men of the present day, and perhaps to some political
economists, it would seem absurd and unnecessary
to apply these tedious arguments and complicated
formulae to the study of mere figures, which at
first sight appear subject to the ordinary rules of arithmetic;
but it will be found as we proceed that we are able by their use
to solve problems and investigate causal relations which, though
apparently simple, must entirely baffle direct attempts to obtain
any solution. The necessity of some application
of the rules of probability becomes evident from
the very definition of the science of statistics.†

* See Lit. and Phil. Magazine, passim; Statistical Journal, passim;
Report of British Association on Methods of Ascertaining Variations in the
Monetary Standard, 1888, and others.

† See supra, p. 7.

Statistics deals with great numbers, the numbers of the items
which compose some part of the economic or social body as
a whole. It does not deal with a single homogeneous mass
but with a complex body composed of multitudinous units
differing in form and action one from the other; and it is
with the complex not with the units that it is concerned.
Just as in the mechanics of rigid bodies it is necessary to
make some hypothesis as to the laws which hold their
constituent molecules in place, before any general problems relating
to their motion as a whole can be attacked, and in the kinetic 
theory of gases a generalized theorem of the motion of the 
separate molecules is employed, so in statistics we must obtain 
some generalizing principle as to the relation of unit to unit 
before we can study the phenomena manifested by the body. 
The economist and the politician, when investigating the 
effect of a given force, are as a rule concerned with its effect 
on the whole mass, not on the individuals in particular.* 
For illustration, we may take one of the numerical totals, relating 
to a nation, that remains nearly stationary year by year ; say 
the number of marriages yearly in a population of ten millions. 
It is on the total that we trace the effects of a change in 
our marriage laws. If we regarded only a single family, or a 
village or small town, we should not find any constancy ; the 
marriage rate would be changing continually with the personnel 
and age of the small community, and we could not trace with 
certainty the effect of any external cause. But add family to 
family, village to village, and district to district ; the individual 
peculiarities of the parts are rapidly lost in the total ; in a 
large community the same number are of marriageable age
year by year, the same distribution by age and sex recurs 
continuously ; if undisturbed by external influences, the same 
marriage rate will be found over a long period. Each couple 
is influenced by many circumstances before finally deciding to 
marry; there are very many causes, each of limited effect, 
which influence the question in different localities, such as an 
exodus of young men from one district, commercial depres- 
sion in another, a new demand for labour in a third ; but 
when many districts are taken together these small disturbances 
counterbalance one another. To produce a change in the 
rate, the action of a cause is necessary which affects many 
districts in the same way. Here is to be found the assumption 
that underlies all statistical investigation, viz., that many inde- 
pendent disturbing causes of small individual effect neutralise 
one another in the mass.†

* See Mill's Logic, Book III., Chap. 23, and Book VI., Chap. 3.
† Compare the title of Lexis' treatise, viz., Zur Theorie der
Massenerscheinungen in der menschlichen Gesellschaft.

The common property of great numbers.

It is a matter of common experience that great numbers
and averages drawn from them are nearly stationary. By
searching for the common properties contained in
these numbers, we shall find the clue to this constancy.
The following are among the numbers
which do not undergo rapid change : birth, marriage, and death
rates in districts of, say over a million inhabitants ; death rates 
according to age or disease over larger areas ; the numbers 
of the inhabitants of a great kingdom, even when subdivided 
by age and sex, the numbers of paupers, criminals, lunatics, 
afflicted ; the consumption of certain commodities, the total 
income, the average wage, and total imports and exports 
(though here the constancy is not so apparent). These are 
all totals of many small items, the existence of each of which is 
determined independently and apparently by chance. Another 
class is to be found in meteorological measurements, such as 
annual rainfall, mean temperature, and mean barometric height, 
where the average or total is again drawn from the combination 
of many small independent variations or contributions. An allied 
class is found in such physical measurements as average height 
and weight. 

It is not so easy to exemplify large numbers which are not 
constant. The total revenue which varies with each change 
in impost is an example. The number of a conscript army 
changes with the law controlling it, the number of volunteers 
with improved conditions of service, the area of the British 
Empire with each territorial extension, the volume of trade 
with a commercial inflation, the death rate with an epidemic. 
All these are changes where one cause has influenced many 
items at once in the same direction ; but even here the 
underlying constancy arising from the multitudinous small 
independent causes is apparent. 

Variation of great numbers.

This constancy, marvellous as it actually is, is generally
accepted as a matter of course; and it is not the regularity
but the occasional deflections which are the subject
of comment. For instance, the death rate in
London will hardly change, except regularly with the seasons,
week by week through a series of years; and when an increase
of 5 per 1,000 occurs in some week, the newspapers write of
an influenza epidemic. The mean annual rainfall will for a long
period be near its average; then a decrease of 5 inches excites
remarks on a permanent change of climate. It is because this
regularity has become a matter of common experience that so
little attention is generally given to it. A cursory inspection,
however, of the records for a period of weeks or years of any of 
these numbers will show that the constancy is not absolute ; 
that each rate varies through a great or small percentage, and, 
except that the variation seldom passes certain limits, without 
any apparent law. Thence at once rises the question, how are 
we to determine whether a given deviation is due to some 
general cause, such as an epidemic, a change of climate, or a 
new law, or is natural to the phenomena ? 

Illustration by the binomial expansion.

This question can only be answered by an appeal to the
laws of probability. To take a numerical instance: suppose we
are dealing with 1,000 men, each fifty years old,
how many should we expect to die in the year?
Fall back on former experience, and find what
has been the average death rate under similar circumstances;
this rate gives the number to be expected à priori; a great
divergence appears from past experience to be improbable,
and the greater the divergence the greater the improbability;
an exact repetition of the average itself appears to be
improbable; the question is, what divergence is to be expected?
This is insoluble directly, but we can frame a hypothesis which
throws light on the problem. Suppose the ascertained death
rate to be 50 (per 1,000), and further suppose that the chance of
death for each individual is \(\frac{50}{1000} = \frac{1}{20}\). Then it is easily
determined by the rules of algebraic probability that the successive
terms in the expansion by the binomial theorem of \(\left(\frac{19}{20} + \frac{1}{20}\right)^{1000}\)
represent respectively the chances that exactly 0, 1, 2, 3 . . .
of the persons die; \(\left(\frac{19}{20}\right)^{1000}\) is the chance that none die,
\(\left(\frac{19}{20}\right)^{999}\left(\frac{1}{20}\right)\) is the chance that one assigned individual only dies,
\(1000 \times \left(\frac{19}{20}\right)^{999}\left(\frac{1}{20}\right)\) that only one unassigned individual dies,
and so on. The death of exactly 50 is more probable than any
other number, 49 very nearly as probable, 51 next. It is very
soon apparent when the successive terms are calculated, that
any great divergence from 50 is very improbable.
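
The successive terms of the expansion are tedious to compute by hand but easy to tabulate. The sketch below works through logarithms to avoid overflow in the large coefficients; the function name `binomial_term` and the choice of the range 35 to 65 for illustration are assumptions, not part of the text.

```python
import math


def binomial_term(k, n, p):
    """Chance of exactly k deaths among n persons, each with chance p."""
    log_term = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_term)


n, p = 1000, 1 / 20
terms = [binomial_term(k, n, p) for k in range(n + 1)]

most_likely = max(range(n + 1), key=lambda k: terms[k])
within_15 = sum(terms[35:66])  # chance of 35 to 65 deaths inclusive

print(most_likely, terms[49] / terms[50], terms[51] / terms[50])
print(within_15)
```

The ratios of neighbouring terms show the ordering stated above: the term for 49 is barely smaller than that for 50, and the term for 51 is smaller again.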

Process justified by its results.

This conception, that all the men start with the same chance
of death, or, in a more developed form, that their chances of
death are grouped about an average \(\frac{1}{20}\), satisfies the à priori
conditions of the problem, and clearly leads to
results which correspond roughly at any rate
with experience; but the justice of the conception cannot be
deduced à priori, for it is universally the case with any
hypothesis as to probability, that conformity with experience
is the only justification for the hypothesis. If it is true, we
should find that when the records of many such generations of
1,000 men were examined, the divergences from the average
were grouped in the way shown by the algebraic calculation. 
The records for this particular examination are not extant; 
but in the sequel some records will be given where experience 
marches with theory, and references will be given to books where 
others may be found ; though it may be said at once that the 
agreement is not perfect, and that there are indications that the 
law is not so simple as that already suggested. 

The meaning of a numerical probability.

Consider the supposition that the chance of a death within a
year is \(\frac{1}{20}\). When we say that the chance of an event is \(\frac{1}{20}\), we
mean that if the circumstances connected with it
recurred again and again, the event would occur on
an average once for each twenty such recurrences.*
Thus if a die with six regular faces is thrown again and again,
the different faces tend to come uppermost with equal frequency.
As a matter of fact, each of the six would probably not be found
once in each six throws, nor exactly two of each in twelve throws;
but, in the long run, it is a matter of experience that the numbers
of times each of the six faces come uppermost tend to be equal.
Suppose an experiment, for the success of which the chance is
\(\frac{1}{20}\), to be performed again and again. In 200 attempts from 8
to 12 successes may be obtained; in 2,000 the proportion of
successes to attempts will probably be nearer \(\frac{1}{20}\), say 94 to 106
successes; in 2,000,000 yet nearer. Now suppose 1,000
experiments to be made; as we have seen, exactly 50 successes are
not to be expected: but let 1,000 after 1,000 be tried; some- 
times more, sometimes less than 50 successes will be obtained ; 
and as the series continues the general average will tend nearer 
and nearer to 50. 
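
The tendency of the proportion of successes to settle near 1/20 as the attempts multiply can be illustrated by direct calculation. The sketch below is an illustration only: the band of one-fifth on either side of the expected number (8 to 12 in 200 attempts) and the function names are assumptions made here, not figures from the text.

```python
from math import exp, lgamma, log

def log_binomial_term(n, r, p):
    """Natural logarithm of the chance of exactly r successes in n attempts."""
    return (lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
            + r * log(p) + (n - r) * log(1 - p))

def chance_within(n, p, fraction):
    """Chance that the successes lie within `fraction` of the expected n*p."""
    expected = n * p
    low = int(round(expected * (1 - fraction)))
    high = int(round(expected * (1 + fraction)))
    return sum(exp(log_binomial_term(n, r, p)) for r in range(low, high + 1))

p = 1 / 20
for n in (200, 2000, 20000):
    # The same proportional band becomes more and more probable as n grows.
    print(n, round(chance_within(n, p, 0.2), 4))
```

Logarithms of factorials are used (through `lgamma`) so that the very large and very small factors do not overflow or underflow for great values of n.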

* Logic of Chance, 3rd Edition, pp. 4, 5.
[Side-note: Relation to law of error.]

Still postponing the examination of the exact grouping of
such numbers about their average, let us examine further the
nature of the argument. Suppose we are given a
series of large numbers or rates, measuring similar
quantities year after year; we shall find, when they are grouped
according to their distance from their average, that the further
from the average the fewer are the instances. In most
cases we cannot work backwards to a number of individuals,
each of whom has an equal chance of furnishing an event, but
we can examine this grouping, notice how far the numbers are
from their average, and so on; in many cases we shall find that
these divergences conform to a definite law, the law of error,
which is obeyed by all great numbers coming from series of
experiments as just described. The point to notice specially
here is, that correspondence to this regular law of divergence is
natural, and it is for discrepancies that we need seek a reason.
It is improbable, it is impossible, that great numbers should
remain absolutely constant; from the nature of the case there
must be variation; in very many cases the natural variation, the
variation to be expected à priori, is that in accordance with the
law of error. This is so with those great numbers which are the
sum of very many items, in favour of the existence of each of
which there is a definite chance, or, more generally, the existence
of each of which may be influenced by many independent causes
each of limited effect.

[Side-note: Cause and chance.]

A slight confusion may arise from the use of the words
cause and chance in this statement; this can be removed by
eliminating the word chance. We say a thing
happens by chance, when its occurrence is influenced
by many independent causes whose separate effects
we cannot trace, as when we draw a card from a thoroughly
shuffled pack. Now if we consider a man's death from
the point of view of an insurance office, we regard the
man as of normal health and constitution, and liable to all
the latent diseases, the accidents, and the epidemics, from
which experience shows men suffer; we cannot trace the incipient
development of a disease, nor foretell the chain of events
which lead to an accident. We then speak simply of the
chance of death within a certain period, and say experience
shows it to be (e.g.) $\frac{1}{20}$, and, regarding the peculiarities of a
particular man as unknown, we say that his chance of death is
$\frac{1}{20}$. Generalizing, any group of men, each of the given age and
in the given circumstances, is composed of individuals for each
of whom the chance of death is $\frac{1}{20}$. Now, go behind the idea of
chance to that of cause. Each death is the result of some
particular event, or, to speak more correctly, is due to the action
of a complex of many causes; all these untraceable causes produce
on an average one death among 20 living; the statement
of the numerical chance is merely the summary of these effects.
To say, then, that the number of deaths to be expected among
1,000 is the same as the number of successes to be expected in
1,000 attempts, the chance of success in each of which is $\frac{1}{20}$, is
not inconsistent with saying that the number of deaths is determined
by the action of a multitude of causes none of which by
itself produces a great effect. In either case the laws of great
numbers will be found to apply. The use of the intermediate
numerical chance only facilitates calculation.

[Side-note: Effect of a predominant cause.]

Now suppose that a new cause is suddenly introduced, or the
action of one of the causes is intensified (say, by an epidemic),
then at once the whole scheme of calculation is
thrown out, and we get a result which does not
correspond to the probability calculation; it is this
non-correspondence which indicates the existence of a disturbing
cause.

Since the distribution in accordance with the curve of error is
the result which may be expected à priori, whenever we are dealing
with numbers generated in this way, it is clearly necessary to
study this distribution before we can base any arguments on the
variation of great numbers. When we have established the result
which the independent action of a very great number of individually
unimportant causes can produce, then, and not till then, we are in
a position to consider the effect of a predominant cause. We
may even be able to deduce the existence of such a cause, for if
we find by examination that a divergence of more than 3 per
cent. from an average is improbable, and in a particular case we
have a divergence of 30 per cent., we are either in the presence
of a very improbable event, or some external predominant cause
has influenced our numbers.
Section II. The Equation of the Curve of Error.

[Side-note: The section can be omitted, but not without loss.]

In this section it is proposed to develop the algebraic equation
and properties of the curve of error, bringing them into
relation with the other sections. It will not be
impossible for non-mathematical readers to follow
the greater part of the argument of this branch of statistics without
working through the mathematical proofs of the formulae;
and the book is so arranged that this section can be omitted.
Other readers may turn this chapter through, looking at the
large type only, and notice the main lines of the
argument. For any thorough student of statistics,
however, the mathematical proofs, which are so simplified in this
chapter as not to involve the integral calculus at all, and the
differential calculus only for two small points, are essential. In
this section an acquaintance with algebra up to and including
the exponential and logarithmic series is assumed. Starting
from that point, the main formulae relating to the curve of error
are deduced.

Elementary Theorems in Probability. 



Definition. If an event can happen in m ways, and fail in n − m
ways, and all these ways are equally likely to occur, then $\frac{m}{n}$ is the
probability of its occurrence.

Let $\frac{m}{n} = p$, and $\frac{n-m}{n} = q$: then p + q = 1. q is the chance that
the event will not occur. The odds in favour are p to q, those against are
q to p.

E.g., the chance that a card, drawn at random from a full pack, is a
spade $= \frac{13}{52} = \frac{1}{4}$.

Theorem. If $p_1$, $p_2$ be the chances of two independent events, then
$p_1 \times p_2$ is the chance that both will occur.

Suppose that $p_1 = \frac{m_1}{n_1}$, $p_2 = \frac{m_2}{n_2}$.

The first event may be expected $m_1$ times in $n_1$ trials, or $m_1 n_2$ times in
$n_1 n_2$ trials.

The second event may be expected $m_2$ times in $n_2$ trials, or $m_1 m_2$
times in $m_1 n_2$ trials.

Hence the second event will occur at the same time as the first $m_1 m_2$ times
in $n_1 n_2$ trials; that is, the chance of the double event is $\frac{m_1 m_2}{n_1 n_2} = p_1 p_2$.

Examples. Independent events. The chance that two sixes will be
thrown with a pair of dice $= \frac{1}{6} \times \frac{1}{6} = \frac{1}{36}$.

Dependent events. The chance that three cards taken in succession
from the same pack shall prove to be ace, king, and queen of the same suit
in any order is $\frac{12}{52} \times \frac{2}{51} \times \frac{1}{50} = \frac{1}{5525}$: for the chance that the first card
drawn is an ace, king, or queen is $\frac{12}{52}$; supposing it to be queen,
the chance that ace or king of same suit follows is $\frac{2}{51}$; and the chance
that the third draw gives the remaining card is $\frac{1}{50}$.

The chance that 13 cards taken at random from a complete pack will
contain 8 spades and 5 clubs is $\frac{{}^{13}C_{8} \times {}^{13}C_{5}}{{}^{52}C_{13}}$;* for 8 spades can be chosen
in ${}^{13}C_{8}$ ways, 5 clubs in ${}^{13}C_{5}$ ways, and the hand may contain any such
group of spades with any such group of clubs; hence the numerator
given corresponds to m of the definition given above; also there are
${}^{52}C_{13}$ equally likely possible hands of 13 cards, so that the denominator
given corresponds to n.
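
The three examples can be checked with exact fractions. The sketch below assumes only the definitions just given; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

# Independent events: two sixes with a pair of dice.
two_sixes = Fraction(1, 6) * Fraction(1, 6)

# Dependent events: ace, king and queen of one suit, in any order,
# in three successive draws from the same pack.
ace_king_queen = Fraction(12, 52) * Fraction(2, 51) * Fraction(1, 50)

# Favourable ways (m) over equally likely ways (n):
# 8 spades and 5 clubs in a hand of 13 cards.
eight_spades_five_clubs = Fraction(comb(13, 8) * comb(13, 5), comb(52, 13))

print(two_sixes)
print(ace_king_queen)
print(eight_spades_five_clubs, float(eight_spades_five_clubs))
```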

Theorem. If n coins are placed at random on a table, the chance
that r will show heads and the rest tails is $\frac{{}^{n}C_{r}}{2^{n}}$.

For suppose there are n places to be filled each with a coin:

The first may show head or tail, two ways.

The second may show head or tail, two ways.

The first two places may therefore be filled in 2 × 2 ways.

The n places may similarly be filled in $2^{n}$ ways.

Now r of these places can be chosen in ${}^{n}C_{r}$ ways; and to each such
selection corresponds one arrangement in which these r places are filled
with heads and the rest with tails (and many other arrangements not
giving this result).

Hence out of $2^{n}$ possible arrangements, ${}^{n}C_{r}$ give the result.

* ${}^{n}C_{r}$ or ${}_{n}C_{r}$ is written for $\frac{n!}{(n-r)!\,r!}$, the number of combinations of n
things r at a time; and ${}^{n}P_{r}$ or ${}_{n}P_{r}$ is written for $\frac{n!}{(n-r)!}$, the number of
permutations of n things r at a time.

Aliter. Consider the product $(h_1 + t_1)(h_2 + t_2) \ldots (h_n + t_n)$.
Any term of this product, e.g., $h_1 t_2 t_3 h_4 t_5 \ldots h_{n-1} t_n$, corresponds
to one arrangement of the n coins.

The number of arrangements containing r heads and n − r tails is
the same as the number of terms containing r h's and n − r t's.

This number is the same as the coefficient of $h^{r} t^{n-r}$ in the expansion
by the binomial theorem of $(h + t)^{n}$, which is obtained from the product
above by writing h for $h_1$, $h_2$, &c., and t for $t_1$, $t_2$, &c.

$(h + t)^{n} = h^{n} + {}^{n}C_{1}\, h^{n-1} t + \ldots + {}^{n}C_{r}\, h^{n-r} t^{r} + \ldots + t^{n}.$

Hence the number of arrangements producing the required result
is ${}^{n}C_{r}$. The total number of possible arrangements is the sum of the
coefficients in this expansion; this is found by putting h = t = 1.

$(1 + 1)^{n} = 1 + {}^{n}C_{1} + \ldots + {}^{n}C_{r} + \ldots + 1.$

Hence total number $= 2^{n}$.

Notice that ${}^{n}C_{r} = {}^{n}C_{n-r}$.

Example. The coefficients in the expansion of $(1 + 1)^{32}$ are as
follows:

     r    32 − r     ${}^{32}C_{r} = {}^{32}C_{32-r}$     Corresponding chance*

     0      32                   1           .0000000002
     1      31                  32           .0000000074
     2      30                 496           .0000001155
     3      29               4,960           .000001155
     4      28              35,960           .000008375
     5      27             201,376           .00004688
     6      26             906,192           .0002110
     7      25           3,365,856           .0007837
     8      24          10,518,300           .002449
     9      23          28,048,800           .006530
    10      22          64,512,240           .01502
    11      21         129,024,480           .03004
    12      20         225,792,840           .05257
    13      19         347,373,600           .08088
    14      18         471,435,600           .1097
    15      17         565,722,720           .1317
    16      16         601,080,390           .1400

                 $2^{32}$ = 4,294,967,296.

* Obtained by dividing each term by $2^{32}$.

The table just given shows that when 32 coins are placed on a
table at random, the chance that 16 heads and 16 tails shall appear
is .14, while it is more likely that either 15 heads (and 17 tails) or
15 tails (and 17 heads) will be found, the united chances for these
being .2634. The chance that the divergence from equal division shall
not exceed 2 (i.e., that there shall be at least 14 of each) is .1400 + 2 ×
(.1317 + .1097) = .6228; the chance that there shall be as many as 27
of one kind is only .00011, i.e., 1 in 9,000.
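
The chances quoted for 32 coins can be recomputed from the coefficients. The following is a sketch under the assumption of the text (each coin equally likely to show head or tail); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 32
total = 2 ** n
# chance[r] is the chance of exactly r heads among the 32 coins.
chance = [Fraction(comb(n, r), total) for r in range(n + 1)]

exactly_16 = chance[16]
fifteen_and_seventeen = chance[15] + chance[17]       # 15 heads or 15 tails
within_two = sum(chance[14:19])                       # at least 14 of each kind
as_many_as_27 = sum(chance[27:]) + sum(chance[:6])    # 27 or more of one kind

for label, value in [("16 and 16", exactly_16),
                     ("15 and 17", fifteen_and_seventeen),
                     ("divergence not above 2", within_two),
                     ("27 or more of one kind", as_many_as_27)]:
    print(label, float(value))
```

Small differences in the last decimal place from the figures in the text are to be expected, since the text adds chances already rounded to four places.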

The Binomial Expansion. 

The following table from Quetelet's Lettres sur la Théorie des
Probabilités, p. 375, shows a similar calculation when the index
is 999 instead of 32. For instance, ${}^{999}C_{500}\left(\frac{1}{2}\right)^{999} = .025225$, the
first quantity in Column 3.

As the index of the binomial expansion is continually 
increased, the grouping of the figures takes a definite shape. 
The curve so obtained when the index is indefinitely great is 
called the curve of error. 

In the diagram at the end of the book, the line A^ F^ Fj 
represents the first half of the coefficients of {a+Vf\ the line 
Aj Gi Gg G3 G4 Gg represents the coefficients of («+3)^®; the^ 
line Ag H^ Hg Hg • • • ^^9 represents the coefficients of (a+^*^, 
and Aq Ai Ag A3 A^ is the curve of error. To fit these jagged 
lines to the curve of error, the maximum coefficient is repre- 
sented in each case by the line O Ag, and ordinates are drawn 
at equal intervals proportional to the other coefficients; the 
tops of these ordinates are then joined by straight lines. Th( 
interval between successive ordinates is decided by the con 
sideration that the area included between any chosen ordi 
nate, say Hj Pj, the base P^ O, the maximum ordinate O Ao» 
and the line Hg H^ Ag shall be the same fraction of the whole 
area Ag O . . . Hg . . . Ag, as the part of the area of the 
limiting curve of error cut off by the same ordinate is of the 
whole area bounded by O Ag, O X and the curve. 

The algebraic determination of this limit is given on pp.
275 seq.

Suppose now that one ball is taken out of each of n bags, each containing
$m_1$ white and $m_2$ black balls, the chance that r will be white
and n − r black is

${}^{n}C_{r}\, p^{r} q^{n-r}$, where $p = \frac{m_1}{m}$ and $q = \frac{m_2}{m}$ and $m = m_1 + m_2$.

For r bags may be selected in ${}^{n}C_{r}$ ways. The chance that each of
Scale of Precision.

999 balls are drawn from a bag containing equally great numbers
of black and white balls.

Column 1 gives number of each colour.
Column 2 gives rank of deviation from equality.
Column 3 (Scale of Probability) gives probability that balls will be drawn
in proportion given in Column 1.
Column 4 (Scale of Precision) gives probability that deviation from equality
will not be greater than that of given rank (sum of probabilities
starting from most probable).

      1.               2.          3.              4.
  Groups of           Rank.    Probability     Sum of
  White.  Black.               that such a     probabilities.
                               group will
                               be drawn.

   499     500          1      .025225         .025225
   498     501          2      .025124         .050349
   497     502          3      .024924         .075273
   496     503          4      .024627         .099900
   495     504          5      .024236         .124136
   494     505          6      .023756         .147892
   493     506          7      .023193         .171085
   492     507          8      .022552         .193637
   491     508          9      .021842         .215479
   490     509         10      .021069         .236548
   489     510         11      .020243         .256791
   488     511         12      .019372         .276163
   487     512         13      .018464         .294627
   486     513         14      .017528         .312155
   485     514         15      .016573         .328728
   484     515         16      .015608         .344335
   483     516         17      .014640         .358975
   482     517         18      .013677         .372652
   481     518         19      .012726         .385378
   480     519         20      .011794         .397172
   479     520         21      .010887         .408060
   478     521         22      .010010         .418070
   477     522         23      .009166         .427236
   476     523         24      .008360         .435595
   475     524         25      .007594         .443189
   474     525         26      .006871         .450060
   473     526         27      .006191         .456251
   472     527         28      .005557         .461809
   471     528         29      .004968         .466776
   470     529         30      .004423         .471199
   469     530         31      .003922         .475122
   468     531         32      .003464         .478586
   467     532         33      .003047         .481633
   466     533         34      .002670         .484304
   465     534         35      .002330         .486634
   464     535         36      .002025         .488659
   463     536         37      .001753         .490412
   462     537         38      .001512         .491924
   461     538         39      .001298         .493222
   460     539         40      .001110         .494332
   459     540         41      .0009458        .495278
   458     541         42      .0008024        .496081
   457     542         43      .0006781        .496759
   456     543         44      .0005707        .497329
   455     544         45      .0004784        .497808
   454     545         46      .0003994        .498207
   453     546         47      .0003321        .498539
   452     547         48      .0002750        .498814
   451     548         49      .0002268        .499041
   450     549         50      .0001863        .499227
   449     550         51      .0001525        .499380
   448     551         52      .0001242        .499504
   447     552         53      .0001008        .499605
   446     553         54      .0000815        .499686
   445     554         55      .0000656        .499752
   444     555         56      .0000526        .499804
   443     556         57      .0000421        .499847
   442     557         58      .0000334        .499880
   441     558         59      .0000265        .499906
   440     559         60      .0000209        .499927
   439     560         61      .0000164        .499944
   438     561         62      .0000128        .499957
   437     562         63      .0000100        .499967
   436     563         64      .0000077        .499974
   435     564         65      .0000060        .499980
   434     565         66      .0000046        .499985
   433     566         67      .0000035        .499988
   432     567         68      .0000027        .4999912
   431     568         69      .0000021        .4999933
   430     569         70      .0000016        .4999948
   429     570         71      .0000012        .4999960
   428     571         72      .0000009        .4999969
   427     572         73      .0000007        .4999976
   426     573         74      .0000005        .4999981
   425     574         75      .0000004        .4999984
   424     575         76      .0000003        .4999987
   423     576         77      .0000002        .4999989
   422     577         78      .00000014       .4999990
   421     578         79      .00000011       .4999991
   420     579         80      .00000004       .4999992

By means of this scale the binomial 999, practically equivalent to curve of error,
can be fitted to and compared with any series of observations.
these will yield a white ball is p × p × p × . . . to r factors, i.e., $p^{r}$; the
chance that each of the other n − r bags will yield a black ball is $q^{n-r}$;
hence required chance is as stated.
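
The formula ${}^{n}C_{r}\, p^{r} q^{n-r}$ is also what generates a scale of precision of the kind printed for the binomial of index 999. The sketch below is an illustration, assuming one white and one black ball in each bag ($m_1 = m_2 = 1$, so that p = q = 1/2); the function names are invented here and the layout of the rows simply mirrors the printed columns.

```python
from fractions import Fraction
from math import comb

def chance_of_r_white(n, r, m1, m2):
    """Chance of r white and n - r black when one ball is drawn from
    each of n bags holding m1 white and m2 black balls."""
    p = Fraction(m1, m1 + m2)
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

def scale_of_precision(n, ranks):
    """Rows of (rank, white, black, probability, running sum),
    starting from the most probable group and moving outwards."""
    rows, running = [], Fraction(0)
    most_white = n // 2          # 499 when n = 999
    for rank in range(1, ranks + 1):
        white = most_white - (rank - 1)
        prob = chance_of_r_white(n, white, 1, 1)
        running += prob
        rows.append((rank, white, n - white, prob, running))
    return rows

rows = scale_of_precision(999, 500)
for rank, white, black, prob, running in rows[:5]:
    print(rank, white, black, round(float(prob), 6), round(float(running), 6))
```

Carried through all 500 ranks the running sum reaches exactly one-half, which is why the sums in the printed scale approach .5.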

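Aliter.

Call the white balls in the first bag ${}_{1}w_{1}, {}_{1}w_{2}, \ldots, {}_{1}w_{m_1}$;
  "   black balls in the first bag ${}_{1}b_{1}, {}_{1}b_{2}, \ldots, {}_{1}b_{m_2}$;
  "   white balls in the second bag ${}_{2}w_{1}, {}_{2}w_{2}, \ldots, {}_{2}w_{m_1}$;
  "   black balls in the second bag ${}_{2}b_{1}, {}_{2}b_{2}, \ldots, {}_{2}b_{m_2}$;

and so on; then all possible arrangements are represented by the
individual terms of the product

$({}_{1}w_{1} + {}_{1}w_{2} + \ldots + {}_{1}w_{m_1} + {}_{1}b_{1} + {}_{1}b_{2} + \ldots + {}_{1}b_{m_2}) \times ({}_{2}w_{1} + {}_{2}w_{2} + \ldots + {}_{2}w_{m_1} + {}_{2}b_{1} + {}_{2}b_{2} + \ldots + {}_{2}b_{m_2}) \times \ldots$ to n factors;

e.g., a term such as ${}_{1}w_{1} \cdot {}_{2}b_{2} \cdot {}_{3}b_{5} \cdot {}_{4}w_{3} \ldots {}_{n}b_{m_2}$ represents one group. A w
will occur r times and a b the remaining n − r times as often as the
term $w^{r} b^{n-r}$ occurs in the binomial expansion of $(m_1 w + m_2 b)^{n}$ [where
all the w's are put as w, and all the b's as b]. The coefficient of $w^{r} b^{n-r}$
in this expansion is ${}^{n}C_{r}\, m_1^{\,r}\, m_2^{\,n-r}$. Total number of possible arrangements
is $m^{n}$. Hence required chance is

$\frac{{}^{n}C_{r}\, m_1^{\,r}\, m_2^{\,n-r}}{m^{n}} = {}^{n}C_{r} \left(\frac{m_1}{m}\right)^{r} \left(\frac{m_2}{m}\right)^{n-r}.$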
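E.g., to find the probable number of sixes in n throws of dice.
Here m = 6, $m_1 = 1$, $m_2 = 5$, $p = \frac{1}{6}$, $q = \frac{5}{6}$.

Probability of r sixes $= {}^{n}C_{r} \left(\frac{1}{6}\right)^{r} \left(\frac{5}{6}\right)^{n-r}$.

Suppose n = 12. Total number of possible arrangements is $6^{12}$ =
2,176,782,336.

    12 sixes occur   ${}^{12}C_{12} \cdot 1^{12} \cdot 5^{0}$    times =              1
    11    "          ${}^{12}C_{11} \cdot 1^{11} \cdot 5^{1}$      "                  60
    10    "          ${}^{12}C_{10} \cdot 1^{10} \cdot 5^{2}$      "               1,650
     9    "          ${}^{12}C_{9} \cdot 1^{9} \cdot 5^{3}$        "              27,500
     8    "          ${}^{12}C_{8} \cdot 1^{8} \cdot 5^{4}$        "             309,375
     7    "          ${}^{12}C_{7} \cdot 1^{7} \cdot 5^{5}$        "           2,475,000
     6    "          ${}^{12}C_{6} \cdot 1^{6} \cdot 5^{6}$        "          14,437,500
     5    "          ${}^{12}C_{5} \cdot 1^{5} \cdot 5^{7}$        "          61,875,000
     4    "          ${}^{12}C_{4} \cdot 1^{4} \cdot 5^{8}$        "         193,359,375
     3    "          ${}^{12}C_{3} \cdot 1^{3} \cdot 5^{9}$        "         429,687,500
     2    "          ${}^{12}C_{2} \cdot 1^{2} \cdot 5^{10}$       "         644,531,250
     1    "          ${}^{12}C_{1} \cdot 1^{1} \cdot 5^{11}$       "         585,937,500
     0    "          ${}^{12}C_{0} \cdot 1^{0} \cdot 5^{12}$       "         244,140,625

                                        Total  =       2,176,782,336

The most probable number of sixes is 2, of which the chance is
about $\frac{3}{10}$. In four-fifths of the trials there will probably be 1, 2, or 3
sixes.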
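
The table for twelve throws can be regenerated from the formula. This is a sketch, assuming a fair die so that each throw offers 1 way of showing a six and 5 ways of not doing so; the names are illustrative.

```python
from math import comb

n = 12
# ways[r] = number of the 6**12 equally likely arrangements showing r sixes.
ways = {r: comb(n, r) * 1**r * 5**(n - r) for r in range(n + 1)}
total = sum(ways.values())

most_probable = max(ways, key=ways.get)
chance_most_probable = ways[most_probable] / total
one_two_or_three = (ways[1] + ways[2] + ways[3]) / total

for r in range(n, -1, -1):
    print(f"{r:2d} sixes occur {ways[r]:>13,d} times")
print("total", f"{total:,d}")
print(most_probable, chance_most_probable, one_two_or_three)
```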

E.g., to investigate whether drunkenness occurs chiefly on night of
pay-day (suppose Saturday). If maximum number of convictions in a
week is on Saturday in 10 weeks out of 12 selected at random, we have
an event whose probability is only $\frac{1650}{2176782336} = \frac{1}{1300000}$ (about),
if the position of the day in the week had nothing to do with it.

It must be noticed that the probability that event will occur 10 times 

out of 12 on any the same week-day is much greater, viz., — . 

Probability that event will occur at least 10 times is $\frac{1650 + 60 + 1}{2176782336} = \frac{1711}{2176782336}$,
which is much the same as before.

Similarly any questions depending on the occurrence of an event in
the same month may be worked out. In this case $m_1 = 1$, $m_2 = 11$,
n = number of years investigated.

If a bag contains m balls of different colours, $m_1$ white, $m_2$ red,
$m_3$ green, &c., the probability that $r_1$ white, $r_2$ red, $r_3$ green, &c., will
occur in n drawings is the term in $p_1^{\,r_1} p_2^{\,r_2} p_3^{\,r_3} \ldots$ in the expansion of
$(p_1 + p_2 + p_3 + \ldots)^{n}$ by the multinomial theorem, where $p_1 = \frac{m_1}{m}$,
$p_2 = \frac{m_2}{m}$, &c.

Notice that the probability of an event, if it was a chance
occurrence, is not the same as the probability that the event was
a chance occurrence.

If 13 trumps appeared in the same hand, we could not say
that the chances were (${}^{51}C_{12}$ =) 158,753,389,900 to 1 that the
hand was "faked," but we should have strong though incommensurable
evidence on the point.

Deduction of Equation of Curve of Error. 

We can now proceed to the determination of the equation of 
the curve of error. 

The chance of r successes is greatest when r is the greatest integer
in pn; this is found by the ordinary method of determining the maximum
term in a binomial expansion.

Let P be this maximum value $= {}^{n}C_{pn}\, p^{pn} q^{qn}$, making the supposition
for brevity that pn is integral, which will not affect the proof; here
${}^{n}C_{pn} = \frac{n!}{(pn)!\,(qn)!}$.
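Let $P_x$ be chance of pn + x white balls.

Then
$$P_x = P \times \left(\frac{p}{q}\right)^{x} \times \frac{qn(qn-1)\ldots(qn-x+1)}{(pn+1)(pn+2)\ldots(pn+x)}
= P \times \frac{\left(1-\frac{1}{qn}\right)\left(1-\frac{2}{qn}\right)\ldots\left(1-\frac{x-1}{qn}\right)}{\left(1+\frac{1}{pn}\right)\left(1+\frac{2}{pn}\right)\ldots\left(1+\frac{x}{pn}\right)}.$$

Taking logarithms of both sides:
$$\log P_x = \log P + \log\left(1-\frac{1}{qn}\right) + \log\left(1-\frac{2}{qn}\right) + \ldots + \log\left(1-\frac{x-1}{qn}\right)
- \log\left(1+\frac{1}{pn}\right) - \log\left(1+\frac{2}{pn}\right) - \ldots - \log\left(1+\frac{x}{pn}\right)$$
$$= \log P - \left(\frac{1}{qn} + \frac{1}{2}\left(\frac{1}{qn}\right)^{2} + \ldots\right) - \ldots - \left(\frac{x-1}{qn} + \frac{1}{2}\left(\frac{x-1}{qn}\right)^{2} + \ldots\right)
- \left(\frac{1}{pn} - \frac{1}{2}\left(\frac{1}{pn}\right)^{2} + \ldots\right) - \ldots - \left(\frac{x}{pn} - \frac{1}{2}\left(\frac{x}{pn}\right)^{2} + \ldots\right)$$
$$= \log P - \frac{1+2+\ldots+(x-1)}{qn} - \frac{1^{2}+2^{2}+\ldots+(x-1)^{2}}{2q^{2}n^{2}} - \ldots
- \frac{1+2+\ldots+x}{pn} + \frac{1^{2}+2^{2}+\ldots+x^{2}}{2p^{2}n^{2}} - \ldots$$
$$= \log P - \frac{x(x-1)}{2qn} - \frac{x(x+1)}{2pn} - \frac{(x-1)\,x\,(2x-1)}{12q^{2}n^{2}} + \frac{x(x+1)(2x+1)}{12p^{2}n^{2}} - \ldots$$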

Now when n, the number of bags, is very great, pn and
qn are also very great if neither p nor q is very small; x, the divergence
from pn, ranges from 0 to qn on the positive side of the maximum,
and 0 to −pn on the negative side. The chance of so great a divergence
as −pn or qn is very small. The chance of a small divergence,
such as x = 1, 2, 3 . . . is very nearly equal to the maximum chance P.
For instance, if x = 3:
$$P_3 = P \times \left(\frac{p}{q}\right)^{3} \times \frac{qn(qn-1)(qn-2)}{(pn+1)(pn+2)(pn+3)}
= P \left(1-\frac{1}{qn}\right)\left(1-\frac{2}{qn}\right)\left(1+\frac{1}{pn}\right)^{-1}\left(1+\frac{2}{pn}\right)^{-1}\left(1+\frac{3}{pn}\right)^{-1}$$
$$= \text{(expanding each term)}\; P \times \left[1 - \frac{3}{nq} - \frac{6}{np} + \text{terms involving } \frac{1}{n^{2}}\right].$$
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EQUATION OF THE CURVE OF ERROR. 2/7 

Hence, in order that $P_x$ may have at once a finite value and one
with a finite divergence from P, x must be very great compared with
unity, but small compared with n.

Re-write the above equation for $P_x$, neglecting unity in comparison with x:
$$\log P_x = \log P - \frac{x^{2}}{2n}\left(\frac{1}{q} + \frac{1}{p}\right) + \frac{x^{3}}{6n^{2}}\left(\frac{1}{p^{2}} - \frac{1}{q^{2}}\right) - \frac{x^{4}}{n^{3}}(\ldots) + \ldots$$

If $\frac{x^{2}}{n}$ were negligible, $\log P_x = \log P$.

If $\frac{x}{n}$ were finite, and therefore x infinite, both $\frac{x^{2}}{n}$ and $\frac{x^{3}}{n^{2}}$ would be
infinite.

That part of the resulting curve, which shows finite curvature, is
found by assuming that x and n are infinite, but $\frac{x^{2}}{n}$ finite.

[The general argument is similar to that used for constructing the
finite part of a parabola on a finite scale, for there $\frac{y^{2}}{x}$ is finite.]

On this hypothesis $\frac{x^{3}}{n^{2}} = \left(\frac{x^{2}}{n}\right)^{\frac{3}{2}} \cdot \frac{1}{\sqrt{n}}$, $\frac{x^{4}}{n^{3}} = \left(\frac{x^{2}}{n}\right)^{2} \cdot \frac{1}{n}$, and these and
presumably further terms are infinitesimal. The equation of the finite
part of the curve is therefore
$$\log P_x = \log P - \frac{x^{2}}{2n}\left(\frac{1}{p} + \frac{1}{q}\right) = \log P - \frac{x^{2}}{2pqn},$$
or $P_x = P\, e^{-\frac{x^{2}}{2pqn}}$, since p + q = 1.

Writing y for $P_x$, $y = P\, e^{-\frac{x^{2}}{2pqn}}$.

The curve is horizontal near the maximum ordinate, for $P_x = P$ when
x is small, and extends to infinity in both directions, the axis of x being
an asymptote, for when $x = \pm\infty$, y is zero.

When $\frac{x}{n}$ is negligible, the curve is very approximately symmetrical;
this symmetry may be shown to extend over the finite part of the
curve, when n is large.* The annexed diagram illustrates the extent of
the asymmetry for a small value of n.

* By considering the values of the various quantities in relation to the
table on p. 281.
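
The closeness of the limiting form $P_x = P\,e^{-x^{2}/2pqn}$ to the exact binomial term can be examined numerically. The sketch below is an illustration only, taking n = 1,000 and p = q = 1/2 as assumed values so that pn is integral; the function names are invented here.

```python
from math import comb, exp

def exact_chance(n, r, p):
    """Exact binomial chance of r white balls in n drawings."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def curve_of_error(n, p, x, peak):
    """Limiting form P * exp(-x^2 / (2pqn)) for a divergence x from pn."""
    q = 1 - p
    return peak * exp(-x * x / (2 * p * q * n))

n, p = 1000, 0.5
centre = int(n * p)                  # pn, the most probable number
peak = exact_chance(n, centre, p)    # P, the maximum chance

for x in (0, 5, 10, 20, 40):
    exact = exact_chance(n, centre + x, p)
    approx = curve_of_error(n, p, x, peak)
    print(x, exact, approx, exact / approx)
```

For divergences that are small compared with n the ratio of the two values stays very near unity, which is the sense in which the neglected terms are infinitesimal.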
Relation of Curve of Error to Statistics.

The following example shows how Quetelet fitted his figures
to given observations:*

Chest Measurements of Scotch Soldiers.

Column 1. Chest measurement, inches.
Column 2. No. of men.
Column 3. Proportional Nos.
Column 4. No. between given measurement and mean.
Column 5. Rank in scale of precision.
Column 6. Calculated rank of measurement.
Column 7. Precision of calculated rank.
Column 8. Calculated No. of observations to each measurement.
Column 9. Differences between columns 3 and 8.

   1.       2.       3.       4.       5.       6.       7.       8.      9.

   33        3        5     5,000     ...      ...     5,000        7      2
   34       18       31     4,995     52       50.5    4,993       29      2
   35       81      141     4,964     42.5     42.5    4,964      110     31
   36      185      322     4,823     33.5     34.5    4,854      323      1
   37      420      732     4,501     26.0     26.5    4,531      732      0
   38      749    1,305     3,769     18.0     18.5    3,799    1,333     28
   39    1,073    1,867     2,464     10.5     10.5    2,466    1,838     29
                              597               2.5      628
   40    1,079    1,882     1,285      5.5      5.5    1,359    1,987    105
   41      934    1,628     2,913     13       13.5    3,034    1,675     47
   42      658    1,148     4,061     21       21.5    4,130    1,096     52
   43      370      645     4,706     30       29.5    4,690      560     85
   44       92      160     4,866     35       37.5    4,911      221     61
   45       50       87     4,953     41       45.5    4,980       69     18
   46       21       38     4,991     49.5     53.5    4,996       16     22
   47        4        7     4,998     56       61.5    4,999        3      4
   48        1        2     5,000     ...      ...     5,000        1      1

          5,738   10,000                                        10,000    488

The chest measurements of 5,738 soldiers were ranged in
order of magnitude, and the numbers of men at each measurement
placed in column 2 against the corresponding number
of inches in column 1. Column 3 gives numbers proportional
to those in column 2, such that their sum is 10,000. It is assumed
that there are 5,000 cases on each side of the (unknown) mean;
then 5,000 cases occur between 33 inches and the mean, 4,995
between 34 inches and mean, and so on, till we find (in column 4)
597 cases between 39 inches and mean. Similarly 1,285 occur
between 40 inches and the mean.

Referring now to the scale of precision we find that 4,995
cases corresponds to rank 52; 4,964 to rank 42.5, and so on.
The numbers of these ranks are placed in column 5.

Now if the observations fitted the curve exactly the distances 



* Ibid., p. 400.

between two ranks corresponding to two successive inches should
be always the same. This is not exactly the case: 34 inches
corresponds to rank 52; 35 inches to 9.5 ranks lower, 42.5;
36 to 9 ranks lower, and so on. It is necessary to assume
some regular interval, which will show as close a correspondence
as possible between theory and observation. It is assumed that
a difference of 1 inch corresponds to 8 ranks, and column 5 is
"smoothed" into column 6 on this hypothesis. The process is
then reversed; against each rank in column 6 is placed in
column 7 the corresponding number from the scale of precision.
Column 8 is then calculated from column 7 in the reverse way to
that by which column 4 is reckoned from column 3. The closeness
of the resemblance between columns 8 and 3 shows how
nearly the measurements fit the theory. In column 9 are placed
the differences between the numbers in columns 3 and 8; the
percentage which the sum of these differences is of the total
number (10,000), is a measure of the fit. If the observations are
plotted out in the same way as the later figures which form the
L1, L2 . . . figure on the diagram at the end of the book, this
ratio of the sum of the differences to the whole number, is a
rough measure of the ratio of the sum of the areas included
between the lines L1 L2 . . . and the curve of error A1 A2 A3.
In the case just discussed the misfit is 4.88 per cent.
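
The measure of misfit is a simple sum of differences, and can be sketched as follows. The two lists are read from columns 3 and 8 of the table of chest measurements (33 to 48 inches); the entry 1,333 in column 8 against 38 inches is an inference made here from columns 3 and 9 (1,305 + 28) and from the column total, not a figure legible in the table, and the variable names are illustrative.

```python
# Column 3: proportional numbers observed, 33 to 48 inches.
observed = [5, 31, 141, 322, 732, 1305, 1867, 1882,
            1628, 1148, 645, 160, 87, 38, 7, 2]

# Column 8: numbers calculated from the scale of precision.
# The sixth entry (38 inches) is inferred, as explained above.
calculated = [7, 29, 110, 323, 732, 1333, 1838, 1987,
              1675, 1096, 560, 221, 69, 16, 3, 1]

# Column 9: differences taken without regard to sign.
differences = [abs(o - c) for o, c in zip(observed, calculated)]
misfit_per_cent = 100 * sum(differences) / sum(observed)

print(differences)
print(sum(differences), misfit_per_cent)
```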

The following seems a better method of estimating the fit,
for it is less dependent on the accidental divergences caused by
the particular interval of measurement taken. Construct a figure
to represent the scale of precision on p. 273; fit as closely as
possible to this another figure, whose ordinates represent the
numbers "at or above" given measurements represented by the
abscissae; the whole curve will be nearly the shape of the figure
facing p. 155, where the smoothed curve may stand for the scale
of precision, symmetrical about its median, and the original
jagged line for the observations. The closeness of the jagged to
the smoothed line shows the fit; and this will not be altered by a
slight shifting of the inch or shilling limits we adopt, which in
the other method often makes a great difference to the regularity*
in discontinuous observations. Moreover, it is not

* E.g., two great numbers at 29s. 9d. and 30s. 3d. respectively will both
be in the same group if our limits are "29s. 6d. and not so much as 30s. 6d.,"
&c., but in different groups if the limits are "29s. and not so much as
30s.," &c.
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280 ELEMENTS OF STATISTICS. 

necessary by the latter method to sort the observations into
groups at all.

We cannot deduce this equation from the most general
hypotheses (stated on p. 303, infra) without the use of the
integral calculus. It is, however, more convenient to use a scale
of precision evaluated from the equation of the curve as obtained
in other ways. In the next table the numbers under x correspond
to Quetelet's "ranks," but the divergence taken as unity
corresponds to the quantity $\sqrt{2pqn}$, which is rank 22.35 . . .
(for $\sqrt{2 \times \frac{1}{2} \times \frac{1}{2} \times 999} = 22.35 \ldots$) in Quetelet's scale. The quantities
under F(x) correspond to those in the earlier scale of precision;
so that against any value of x is found the chance that an
observation shall be between the average and x. The figures
are adapted from Lexis' Massenerscheinungen, pp. 93, 94, and
Quetelet, ibid., p. 389. This table and Quetelet's can be used
indifferently; they yield very nearly the same results.
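
The relation between Quetelet's ranks and the argument x can be sketched numerically. The sketch rests on an assumption made here and not stated in the surviving text: that F(x) is one-half of the error function, $F(x) = \frac{1}{\sqrt{\pi}} \int_0^x e^{-t^{2}}\,dt$, which agrees with tabulated entries such as F(1.00) = .421 and F(2.20) = .49907. The function names are illustrative.

```python
from math import erf, sqrt

def F(x):
    """Assumed form of the tabulated function: half the error function."""
    return 0.5 * erf(x)

# Divergence taken as unity, expressed in Quetelet's ranks (p = q = 1/2, n = 999).
unit_in_ranks = sqrt(2 * 0.5 * 0.5 * 999)

def F_from_rank(rank):
    """Chance that an observation lies between the average and a given rank."""
    return F(rank / unit_in_ranks)

print(round(unit_in_ranks, 2))
for x in (0.5, 1.0, 2.2):
    print(x, round(F(x), 5))
for rank in (10, 20, 40):
    print(rank, round(F_from_rank(rank), 6))
```

The values returned for whole ranks may be set beside the sums in Quetelet's scale of precision; the two should agree closely, though not to the last figure.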
Values of F(x) for Different Values of x, where
$F(x) = \frac{1}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$.

   x.      F(x).
  2.20    .49907
  2.50    .49980
  3.00    .499,989
  4.00    .499,999,992
  5.00    .499,999,999,999,2

   x.   F(x).     x.   F(x).     x.   F(x).     x.   F(x).     x.   F(x).

  .00   .000     .36   .195     .71   .342    1.06   .433    1.41   .477
  .01   .006     .37   .200     .72   .346    1.07   .435    1.42   .478
  .02   .011     .38   .205     .73   .349    1.08   .437    1.43   .478
  .03   .017     .39   .209     .74   .352    1.09   .438    1.44   .479
  .04   .023     .40   .214     .75   .356    1.10   .440    1.45   .480
  .05   .028     .41   .219     .76   .359    1.11   .442    1.46   .481
  .06   .034     .42   .224     .77   .362    1.12   .443    1.47   .481
  .07   .039     .43   .228     .78   .365    1.13   .445    1.48   .482
  .08   .045     .44   .233     .79   .368    1.14   .447    1.49   .482
  .09   .051     .45   .238     .80   .371    1.15   .448    1.50   .483
  .10   .056     .46   .242     .81   .374    1.16   .450    1.52   .484
  .11   .062     .47   .247     .82   .377    1.17   .451    1.54   .485
  .12   .067     .48   .251     .83   .380    1.18   .452    1.56   .486
  .13   .073     .49   .256     .84   .383    1.19   .454    1.58   .487
  .14   .078     .50   .260     .85   .385    1.20   .455    1.60   .488
  .15   .084     .51   .265     .86   .388    1.21   .456    1.62   .489
  .16   .090     .52   .269     .87   .391    1.22   .458    1.64   .490
  .17   .095     .53   .273     .88   .393    1.23   .459    1.66   .491
  .18   .100     .54   .277     .89   .396    1.24   .460    1.68   .491
  .19   .106     .55   .282     .90   .398    1.25   .461    1.70   .492
  .20   .111     .56   .286     .91   .401    1.26   .463    1.72   .493
  .21   .117     .57   .290     .92   .403    1.27   .464    1.74   .493
  .22   .122     .58   .294     .93   .406    1.28   .465    1.76   .494
  .23   .128     .59   .298     .94   .408    1.29   .466    1.78   .494
  .24   .133     .60   .302     .95   .410    1.30   .467    1.80   .495
  .25   .138     .61   .306     .96   .413    1.31   .468    1.82   .495
  .26   .143     .62   .310     .97   .415    1.32   .469    1.84   .495
  .27   .149     .63   .314     .98   .417    1.33   .470    1.86   .496
  .28   .154     .64   .317     .99   .419    1.34   .471    1.88   .496
  .29   .159     .65   .321    1.00   .421    1.35   .472    1.90   .496
  .30   .164     .66   .325    1.01   .423    1.36   .473    1.92   .496
  .31   .169     .67   .328    1.02   .425    1.37   .474    1.94   .497
  .32   .175     .68   .332    1.03   .427    1.38   .475    1.96   .497
  .33   .180     .69   .335    1.04   .429    1.39   .475    1.98   .497
  .34   .185     .70   .339    1.05   .431    1.40   .476    2.00   .498
  .35   .190                                                 2.05   .498
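The tabulated quantity appears to agree with half the Gauss error function, F(x) = erf(x)/2. The following is a minimal Python sketch under that assumption; the function name `scale_of_precision` is ours and does not occur in the text. It can be used to recompute or check individual entries.

```python
import math


def scale_of_precision(x: float) -> float:
    """F(x) = (1/sqrt(pi)) * integral from 0 to x of exp(-t^2) dt.

    This equals erf(x) / 2, so the standard library suffices.
    """
    return math.erf(x) / 2.0


if __name__ == "__main__":
    # Print a few entries in the same three-figure form as the table.
    for x in (0.21, 0.50, 1.00, 1.50, 2.00):
        print(f"x = {x:.2f}   F(x) = {scale_of_precision(x):.3f}")
```

Since the table is given to three figures only, agreement can be expected to that accuracy and no further.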



Special Points on the Curve. 

Before we can show how to fit observations to this table, we 
must consider the equation of the law of error more closely. 

Definition. The probable error of a series of observations
is that divergence from their mean on either side, within which
exactly half the observations lie. This quantity is more
appropriately called the Quartile Deviation.*

In the scale of precision the corresponding number is .25,
either above or below the mean; the corresponding rank in
Quetelet's scale is about 10.7. In the table just given, the value
of F(x) which corresponds to the probable error is .25, and the
corresponding value of x is calculated to be .47694, a quantity
usually designated by ρ.

An approximation to the probable error for a given series
of observations is obtained by arranging all the observations
in order of magnitude; marking the magnitude, say α, above
which 25 per cent. of the observations lie, and the magnitude,
say β, below which 25 per cent. lie. Half the difference between
α and β is the probable error.

A useful way of illustrating this is to say that if one
observation is chosen at random out of a group, it is as likely as
not that it will differ from the average by less than the probable
error.

In the figure given at the end of the book, the probable
errors are at the points $P_1$, $P_2$ for the curves A and C, and
$p_1$, $p_2$ for the curve B.

By means of the approximation given in the last paragraph
but two, the curve of error can be fitted to a series of
observations by equating the probable error so determined to the
value x = .47694, and comparing the values of F(x) with the ranged
observations; but though this method is simple and rapid it is
not the best.

By a suitable change of scale for ordinate and abscissa the
equation given on p. 277 can be written $y = e^{-x^2}$, and this is
the most general equation of the normal curve of error.

* See Yule in Statistical Journal, 1896, p. 33a

If x = 0, y = 1; hence the unit of ordinate is the number of
cases at zero error; from the table on p. 281 above, the unit of
abscissa is the probable error ÷ .47694. Since
$\int_{-\infty}^{+\infty} e^{-x^2}\,dx$ is shown in the integral
calculus to be $\sqrt{\pi}$, the area contained between the curve
and the axis of x is $\sqrt{\pi}$. If the equation is written

$$y = \frac{1}{\sqrt{\pi}}\, e^{-x^2},$$

its area is 1, that is, unit of area equals unit of probability
(that is, certainty), and the area contained by any two ordinates,
the curve and the axis of abscissae equals the probability of an
occurrence between the errors represented by the abscissae of
those ordinates.†

† Hence the chance of an error between x and x + dx is
$\frac{1}{\sqrt{\pi}}\, e^{-x^2}\,dx$.

If the curve is traced from either of the tables, it will be
found that it changes from concavity to convexity on each side
of the maximum ordinate, at such points as $s_1$, $s_2$, $S_1$,
$S_2$ in the following diagram. If the unit of abscissa is taken as
a concrete quantity, say, 1 inch, and the abscissa ($OS_2$) of this
point of inflexion is ε, in the same units, then the equation of
the curve is

$$y = \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{2\epsilon^2}},$$

for then

$$\frac{d^2y}{dx^2} = \frac{1}{\epsilon^2\sqrt{\pi}} \left(\frac{x^2}{\epsilon^2} - 1\right) e^{-\frac{x^2}{2\epsilon^2}} = 0, \text{ if } x = \pm\epsilon,$$

and therefore the points $\left(\pm\epsilon, \frac{1}{\sqrt{\pi e}}\right)$
are points of inflexion.

Let $2\epsilon^2 = c^2 = \frac{1}{h^2}$; then the equation is
$y = \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$, or
$y = \frac{1}{\sqrt{\pi}}\, e^{-h^2x^2}$.

The area of this curve is
$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}\,dx = c$.

Choose the unit of ordinate so that this area shall be unity.
Then the equation is

$$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}} = \frac{h}{\sqrt{\pi}}\, e^{-h^2x^2}.$$

c, which thus determines the unit of abscissae of the curve, is
called the modulus.



Determination of the Modulus.

Suppose we have a series of observations which we know are
selected from a group which conforms to the law of error; it is
required to find, from the observations, the centre and modulus
of the curve from which they would come with least improbability.

Let $x_1, x_2, \ldots x_n$ be the observed values. Let $\bar{x}$ be
their arithmetic mean. Let $\delta_1, \delta_2 \ldots$ be the
divergencies of the values from this mean.
Then $\delta_1 = x_1 - \bar{x}$, $\delta_2 = x_2 - \bar{x}$, ...;
$\Sigma\delta = \Sigma x - n\bar{x} = 0$, for $\bar{x} = \frac{\Sigma x}{n}$.

Let the equation of the curve to which the observations belong be

$$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x-k)^2}{c^2}},$$

where c and k have to be determined.

Let $y_1, y_2, \ldots y_r \ldots y_n$ be the values of y corresponding
to $x_1, x_2, \ldots x_r \ldots x_n$.

Then $y_r\,dx$ is the chance that an object taken at random from the
group conforming to the curve shall be between $x_r$ and $x_r + dx$.*
Let P be the chance that the n given observations occur together.
Then

$$P = y_1 \cdot y_2 \cdots y_n = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x_1-k)^2}{c^2}} \times \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x_2-k)^2}{c^2}} \times \ldots \times \frac{1}{c\sqrt{\pi}}\, e^{-\frac{(x_n-k)^2}{c^2}}.$$

Now $\Sigma(x-k)^2 = \Sigma(\delta + \bar{x} - k)^2 = \Sigma\delta^2 + 2(\bar{x}-k)\,\Sigma\delta + n(\bar{x}-k)^2 = \Sigma\delta^2 + n(\bar{x}-k)^2$,
since $\Sigma\delta = 0$.

Whatever value we assign to c, P will be greatest when the quantity
$(\bar{x}-k)^2$ is least, that is when $k = \bar{x}$.

Giving k this value,

$$P = \frac{1}{c^n \pi^{n/2}}\, e^{-\frac{\Sigma\delta^2}{c^2}}, \qquad \frac{dP}{dc} = \pi^{-n/2} \left(-n\,c^{-n-1} + 2\,\Sigma\delta^2\, c^{-n-3}\right) e^{-\frac{\Sigma\delta^2}{c^2}}.$$

In order that P may be a maximum, $\frac{dP}{dc}$ must = 0.†

Hence $c^2 = \frac{2\Sigma\delta^2}{n}$.‡

* When the magnitude of the observations is discontinuous, as in
Quetelet's scale, no dx is necessary; but if the magnitude is
continuous the probability of any defined error is zero, and y dx
is the chance of an error between infinitesimal limits.

† And $\frac{d^2P}{dc^2}$ be negative, which can be shown to be the
case here.

‡ The proof here given is based on Merriman's Method of Least
Squares; but it is suggested that his statement "for a given system
of errors, it must be considered that the observations have been as
precise as possible," § 65, is unnecessarily obscure.

Thus the curve required has its centre at the arithmetic
mean of the observed values and modulus equal to
$\sqrt{\frac{2\Sigma\delta^2}{n}}$, where the δ's are the differences
between the observed values and their mean.*
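The result just obtained can be sketched in a few lines of Python. The function name `centre_and_modulus` and the sample figures are hypothetical, chosen only to illustrate the rule that the centre is the arithmetic mean and the modulus is the square root of twice the mean squared divergence.

```python
import math


def centre_and_modulus(observations):
    """Return (centre, modulus) for a list of observed values.

    centre  = arithmetic mean of the observations
    modulus = sqrt(2 * sum(delta^2) / n), delta = observation - mean
    """
    n = len(observations)
    centre = sum(observations) / n
    sum_sq = sum((x - centre) ** 2 for x in observations)
    modulus = math.sqrt(2.0 * sum_sq / n)
    return centre, modulus


# Hypothetical observations, for illustration only.
sample = [9.0, 10.0, 11.0, 10.0]
centre, c = centre_and_modulus(sample)
precision = 1.0 / c  # the reciprocal of the modulus
print(centre, c, precision)
```

For these four values the divergencies are -1, 0, 1, 0, so the sum of squares is 2 and the modulus is exactly 1.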

c, so determined, is called the modulus; $h = \frac{1}{c}$ is called
the precision. Professor Edgeworth proposes to call
$c^2 = \frac{2\Sigma\delta^2}{n}$ the fluctuation.†

It can be shown that half the curve
$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$ is included between
the ordinates corresponding to ±r, when r = .47694 c, as found from
the scale of precision.

It can be shown that the arithmetic average (η) of all the
errors, considered positive, is given by

$$\eta = \frac{c}{\sqrt{\pi}}, \text{ whence } r = .8453\,\eta.$$

η is the abscissa of the centroid of the positive half of the
curve.

η is more easily calculated from the observations than c, and
can in some cases be used in its stead.§ η is called the average
error or mean of errors.

In the table given on p. 281, the modulus is taken as 1;
when x = 1, F(x) = .421 ...; that is, .421 ... of the curve lies
between the ordinate corresponding to the abscissa equal to the
modulus and the central ordinate. When x = 2, F(x) = .4976 ... Hence
the chance of an observation showing a divergence from the mean
on either side of more than twice the modulus = .005 ...; the
corresponding rank in Quetelet's table is 45.1.

ε, used on p. 283, is now seen to be equal to
$\sqrt{\frac{\Sigma\delta^2}{n}}$ and is called the error of mean
square, or the standard deviation.¶

* See also p. 307, infra.

† Stat. Journal: Jubilee Number, p. 188; and p. 298, infra.

§ Stat. Journal, loc. cit.

¶ Both ε and η have been called mean error, and this term has become
misleading.
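The numerical relations among these quantities can be checked with a short Python sketch. It assumes, as an identification of ours rather than of the text, that the scale of precision is F(x) = erf(x)/2, and it finds ρ by simple bisection; all names are illustrative.

```python
import math


def F(x: float) -> float:
    """Scale of precision with modulus 1, taken here as erf(x) / 2."""
    return math.erf(x) / 2.0


def rho(tol: float = 1e-12) -> float:
    """Solve F(x) = 0.25 by bisection on the interval [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if F(mid) < 0.25:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


c = 1.0                          # modulus
r = rho() * c                    # probable error
eta = c / math.sqrt(math.pi)     # average error (mean of errors)
eps = c / math.sqrt(2.0)         # error of mean square
beyond_2c = 1.0 - 2.0 * F(2.0)   # chance of exceeding twice the modulus

print(f"r/c = {r:.5f}, eta/c = {eta:.5f}, eps/c = {eps:.5f}")
print(f"r/eta = {r / eta:.4f}, chance beyond 2c = {beyond_2c:.4f}")
```

Bisection is used only because it needs nothing beyond the standard library; any root-finder would serve.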
In the diagram at the end, $OP_1$, $OP_2$ are the probable errors,
$OM_1$, $OM_2$ the moduli, $OS_1$, $OS_2$ the errors of mean square,
and $OE_1$, $OE_2$ the mean errors for the curves A and C. $Op_1$,
$Op_2$, $Om_1$, $Om_2$, $Os_1$, $Os_2$ are corresponding quantities
for the curve B. It is to be noticed that the line $XOX_1$ is an
asymptote of each of the three curves drawn. In the figure OG5
equals about twice the modulus. The ratio of the small area to the
right of a vertical line through G5 to the area of half the curve is
the chance that twice the modulus shall be exceeded. The
distance between $OX_1$ at three times the modulus and the curve
is too small to be shown.

From the foregoing it will be seen that any curve of error,
$y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$, can be obtained by
projection from the same standard curve $y = e^{-x^2}$, just as any
ellipse can be obtained by projection from a circle; but as ellipses
differ from one another in virtue of different values of their
eccentricity, so curves of error differ from one another in virtue
of different values of their modulus. As we have seen, on any such
curve there are certain definite points (the positions of the
modulus, mean error, probable error, and error of mean square); if
then we have the same units of abscissae, such as 1 inch, for two
sets of observations,
these points will take different positions. If for one set the
modulus is 2 inches, and for another 1 inch, .843 of the
observations will be within 2 inches of the mean in the first case,
and within 1 inch of the mean in the second. If we regard the
observations as attempts to hit the mean and the divergencies
as errors, the aiming in the second case is twice as precise
as in the first. The precision thus defined is in inverse proportion
to the modulus, and is therefore suitably measured by
$h = \frac{1}{c}$.

If the standard form of equation is adopted in both cases, the
area of both curves will be unity, and their actual shapes those
of $C_1 C_2 C_3$ and $B_1 B_2 B_3$.

The calculation of the precision of a set of observations does
not require either of the tables, but is simply the evaluation of
$h = \sqrt{\frac{n}{2\Sigma\delta^2}}$ from the observations
themselves.

Another form of the equation in common use is
$y = \frac{n}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$, where n is the
whole number of observations in question. The area of this curve is
n, so that unit area corresponds to one observation.

Compare now this form with that obtained from the limit
of the binomial expansion, $y = P\, e^{-\frac{x^2}{2npq}}$.

If x = 0, y = P; hence P is the maximum ordinate.
Adjusting the unit of ordinate so that the area of the curve
is the same in both forms, the two equations coincide when their
exponents agree.
Hence $2npq = c^2$, and therefore $2np(1-p) = c^2$.
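The relation between the binomial expansion and the curve of error can be checked numerically. The sketch below is illustrative only: it takes n = 50 and p = 1/2 as an assumed case, sets the modulus to the square root of 2npq, and compares a few binomial terms with the ordinates of the curve of unit area. The function names are ours.

```python
import math


def binomial_term(n: int, p: float, k: int) -> float:
    """Chance of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def curve_of_error(x: float, c: float) -> float:
    """Ordinate of the curve of unit area with modulus c."""
    return math.exp(-((x / c) ** 2)) / (c * math.sqrt(math.pi))


n, p = 50, 0.5
c = math.sqrt(2 * n * p * (1 - p))  # modulus from the binomial

for k in (25, 28, 30, 35):
    x = k - n * p  # divergence from the most probable number
    print(k, round(binomial_term(n, p, k), 5), round(curve_of_error(x, c), 5))
```

The agreement is close near the centre and loosens in the tails, as is to be expected of a limiting form applied to a finite n.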



Examples. 

I. The following figures, which are taken from Professor
Westergaard's Die Grundzüge der Theorie der Statistik, but
treated in a manner different from his, will serve to illustrate the
meaning of the formulae, and show how to fit a curve to
observations; limits of space prevent more elaborate examples.



Births in Denmark.

                 Number.         Percentage Boys   Difference from
Year.        Total.     Boys.       of Total.         Average.

1860         54,797    28,308        51.66             +.23
1861         53,747    27,506        51.17             -.26
1862         53,011    27,300        51.50             +.07
1863         53,939    27,841        51.62             +.19
1864         52,884    27,334        51.68             +.25
1865         55,434    28,483        51.38             -.05
1866         57,353    29,747        51.87             +.44
1867         54,763    28,036        51.20             -.23
1868         56,546    28,985        51.26             -.17
1869         54,056    27,577        51.02             -.41
1870         56,472    29,144        51.60             +.17
1871         56,407    29,045        51.49             +.06
1872         57,274    29,462        51.44             +.01
1873         58,616    30,115        51.37             -.06
1874         59,324    30,594        51.57             +.14
1875         61,791    31,784        51.44             +.01
1876         63,967    32,912        51.45             +.02
1877         63,772    32,508        50.98             -.45
1878         63,144    32,505        51.48             +.05
1879         64,363    33,114        51.45             +.02

Average      57,583      ...         51.43              ...
Calculated modulus $\sqrt{2 \times 57583 \times .5143 \times .4857} = 169.61$
for a total of 57,583. Equivalent to .2945 ... for a total of 100.

In the formula $\sqrt{2pqn}$, p, the chance that any child born is a
boy, is taken as .5143, since 51.43 is the average percentage male
births are of total births. q = 1 - p = .4857. The number of
experiments n is taken to be the average number of births per
year. Then $\sqrt{2pqn}$ is found to be 169.61. This is the modulus
to apply to the whole number of births; but since this differs
year by year it is convenient to reduce all numbers to
percentages. The examples are then arranged as "between average
+ modulus and average - modulus," "between average + 9/10 of
modulus and average - 9/10 of modulus," and so on; the first
group (± .460) is taken so as to include the extreme.









                                      Calculated.
Within.           Observed.     x.       F(x) × 2.

51.44 ± .460         20                 .972 of 20 = 19.4
51.44 ± .295         17         1.      .843   "   = 16.8
51.44 ± .265         17         .9      .797   "   = 15.9
51.44 ± .236         15         .8      .741   "   = 14.8
51.44 ± .206         13         .7      .678   "   = 13.6
51.44 ± .177         12         .6      .604   "   = 12.1
51.44 ± .147         10         .5      .521   "   = 10.4
51.44 ± .118          9         .4      .428   "   =  8.6
51.44 ± .088          9         .3      .329   "   =  6.6
51.44 ± .057          6         .2      .223   "   =  4.5
51.44 ± .029          4         .1      .112   "   =  2.2



The numbers to be expected from theory are given along- 
side; the fit is fairly close, and can be tested by the method 
described on p. 279. 

The modulus calculated by the formula $\sqrt{\frac{2\Sigma\delta^2}{n}}$ is .305.
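Both moduli of this example can be recomputed from the figures of the table. The following Python sketch is illustrative; the variable names are ours, and the differences are those printed in the last column of the table of births.

```python
import math

# Differences of the yearly percentage of boys from the average, 1860-1879.
diffs = [0.23, -0.26, 0.07, 0.19, 0.25, -0.05, 0.44, -0.23, -0.17, -0.41,
         0.17, 0.06, 0.01, -0.06, 0.14, 0.01, 0.02, -0.45, 0.05, 0.02]

# Modulus from the observations: sqrt(2 * sum(delta^2) / n).
n_years = len(diffs)
observed_modulus = math.sqrt(2 * sum(d * d for d in diffs) / n_years)

# Modulus from the binomial formula sqrt(2 p q n).
p = 0.5143        # chance that a child born is a boy
births = 57583    # average number of births per year
binomial_modulus = math.sqrt(2 * p * (1 - p) * births)
binomial_modulus_pct = 100 * binomial_modulus / births

print(round(observed_modulus, 3))      # modulus in percentage points
print(round(binomial_modulus, 2))      # modulus on a total of 57,583
print(round(binomial_modulus_pct, 4))  # the same on a total of 100
```

The two percentage moduli are of the same order, which is the sense in which the fit is described as fairly close.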

II. If a digit is taken at random, the chance that it will be
less than 5 (0, 1, 2, 3, or 4) is ½. If we take a book of logarithms
and note the digits in the 7th decimal places in successive
numbers, we shall have a practically random selection of digits.
If we take groups of 50 numbers, the chances of 0, 1, 2 ... 50
occurrences of digits less than 5 are given respectively by the
terms of the expansion $(\tfrac{1}{2} + \tfrac{1}{2})^{50}$. This
experiment was repeated 300 times and the results tested in
accordance with both the formulae for the modulus, viz.,
$\sqrt{2pqn} = \sqrt{2 \times \tfrac{1}{2} \times \tfrac{1}{2} \times 50} = 5$,
and $\sqrt{\frac{2\Sigma\delta^2}{n}}$, which was found to be 4.75.

The values of the other standard errors were both found
directly and deduced from c, with very fair correspondence
between the two methods.



Number of Occurrences of 0, 1, 2, 3, or 4 in the 7th
Decimal Places of Groups of 50 Logarithms.

                                                   Averages of
                                                     Lines.
  29   19   25   25   22   28   16   23   22   27     23.6
  24   28   30   22   20   27   24   24   27   22     24.8
  27   26   28   21   21   22   22   27   25   25     24.4
  25   28   21   23   22   23   27   27   25   25     24.6
  28   23   26   23   22   29   28   25   23   26     25.3
  24   22   22   19   26   24   26   28   20   25     23.6
  24   23   27   29   26   21   26   31   23   27     25.7
  26   26   30   25   25   24   29   25   21   27     25.8
  30   24   25   27   24   30   24   28   24   30     26.6
  26   21   21   22   31   28   26   26   26   33     26.0
  19   25   26   34   21   28   21   29   19   23     24.5
  29   26   19   29   24   27   25   25   24   22     25.0
  24   27   21   23   25   21   26   28   25   27     24.7
  25   28   29   30   28   27   28   23   25   26     26.9
  30   30   18   22   24   23   25   27   25   31     25.5
  25   26   27   21   23   25   24   20   25   22     23.8
  28   24   20   18   25   19   25   30   29   25     24.3
  22   27   24   28   22   20   23   25   26   26     24.3
  16   27   28   27   23   20   29   26   20   24     24.0
  26   28   28   23   21   24   25   21   14   28     23.8
  25   24   21   24   24   21   24   30   25   26     24.4
  28   32   17   23   29   24   22   33   29   29     26.6
  25   26   26   27   22   20   24   26   24   24     24.4
  28   26   25   25   29   25   24   22   26   25     25.5
  22   30   22   27   25   27   27   27   28   21     25.6
  35   26   23   26   31   28   26   22   22   29     26.8
  26   20   28   23   22   28   24   23   30   16     24.0
  28   26   25   31   27   28   22   26   30   19     26.2
  26   27   27   18   25   24   27   22   30   30     25.6
  26   25   24   17   27   32   25   21   23   30     25.0

Averages of Columns.
25.9 25.7 24.4 24.4 24.5 24.9 24.8 25.7 24.5 25.7
    1.          2.        3.         4.        Product of    Product of
              Times.    Errors.   Squares of   Col. 2 and    Col. 2 and
                                    Error.       Col. 3.       Col. 4.

 14 occurs      1       -11.04     121.88       -11.04        121.88
 16    "        3       - 9.04      81.72       -27.12        245.16
 17    "        2       - 8.04      64.64       -16.08        129.28
 18    "        3       - 7.04      49.56       -21.12        148.68
 19    "        7       - 6.04      36.48       -42.28        255.36
 20    "        9       - 5.04      25.40       -45.36        228.60
 21    "       18       - 4.04      16.32       -72.72        293.76
 22    "       26       - 3.04       9.24       -79.04        240.24
 23    "       21       - 2.04       4.16       -42.84         87.36
 24    "       32       - 1.04       1.08       -33.28         34.56
 25    "       42       -  .04        .00       - 1.68           .00
 26    "       36       +  .96        .92       +34.56         33.12
 27    "       30       + 1.96       3.84       +58.80        115.20
 28    "       28       + 2.96       8.76       +82.88        245.28
 29    "       15       + 3.96      15.68       +59.40        235.20
 30    "       16       + 4.96      24.60       +79.36        393.60
 31    "        5       + 5.96      35.52       +29.80        177.60
 32    "        2       + 6.96      48.44       +13.92         96.88
 33    "        2       + 7.96      63.36       +15.92        126.72
 34    "        1       + 8.96      80.28       + 8.96         80.28
 35    "        1       + 9.96      99.20       + 9.96         99.20

              300        ...         ...        785.12       3387.96
                                             Sum of Errors,    Sum of
                                             all considered   Squares.
                                               positive.



Average, 25.04. Median, 25. Quartiles, 27, 23. Hence probable error is
approximately 2.

Mean error $= \eta = \frac{785.12}{300} = 2.617$.

Error of mean square $= \sqrt{\frac{3387.96}{300}} = 3.36 = \epsilon$.

Modulus $= \epsilon \times \sqrt{2} = 4.75 = c$.

Also probable error = .4769 c = 2.265 = r, and mean error = r ÷ .8453 = 2.68, nearly
the values obtained directly.
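The whole of this calculation can be repeated from the frequency table. The Python sketch below is illustrative and the variable names are ours. It works from the exact average rather than the rounded 25.04 and from unrounded squares, so its results may differ from the printed ones in the last figure.

```python
import math
import statistics

# Frequency table: number of occurrences -> times observed (300 groups).
counts = {14: 1, 16: 3, 17: 2, 18: 3, 19: 7, 20: 9, 21: 18, 22: 26, 23: 21,
          24: 32, 25: 42, 26: 36, 27: 30, 28: 28, 29: 15, 30: 16, 31: 5,
          32: 2, 33: 2, 34: 1, 35: 1}

# Expand the table into the ordered list of the 300 observations.
values = [v for v, times in sorted(counts.items()) for _ in range(times)]
n = len(values)

mean = sum(values) / n
median = statistics.median(values)
lower_quartile = values[n // 4]            # 76th smallest observation
upper_quartile = values[(3 * n) // 4 - 1]  # 225th smallest observation
probable_error_direct = (upper_quartile - lower_quartile) / 2

eta = sum(abs(v - mean) for v in values) / n             # mean error
eps = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # error of mean square
c = eps * math.sqrt(2)                                   # modulus
r_from_c = 0.4769363 * c                                 # probable error from c
eta_from_r = r_from_c / 0.8453                           # mean error from r

print(round(mean, 2), median, lower_quartile, upper_quartile)
print(round(eta, 3), round(eps, 2), round(c, 2))
print(round(r_from_c, 3), round(eta_from_r, 2))
```

The quartiles are read off as single ordered observations, which is adequate here because the neighbouring observations at each quartile are equal.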

The following table compares the distribution with that of 
the normal curve, by a method differing from those previously 
used. 
Divergence of 1 from average corresponds to $x = \frac{1}{4.75} = .21$.

Between                 Observed.
Average and     Above       Below
Average         Average.    Average.    Total.     x.      2F(x).

   ± 1            36    +     42    =     78       .21     .117 of 600 =  70
   ± 2            66    +     74    =    140       .42     .224    "   = 134
   ± 3            94    +     95    =    189       .63     .314    "   = 188
   ± 4           109    +    121    =    230       .84     .383    "   = 230
   ± 5           125    +    139    =    264      1.05     .431    "   = 259
   ± 6           130    +    148    =    278      1.26     .463    "   = 277
   ± 7           132    +    155    =    287      1.47     .481    "   = 289
   ± 8           134    +    158    =    292      1.68     .491    "   = 295
   ± 9           135    +    160    =    295      1.89     .496    "   = 298
   ±10           136    +    163    =    299      2.10     .499    "   = 299
   ±11           136    +    163    =    299      2.31     .499    "   = 300
   ±12           136    +    164    =    300      2.52     .499    "   = 300

The fit is close. The symmetry is spoilt by the great 
number 42 at 25 occurrences, just below the average. 
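The comparison can be reproduced with a short Python sketch. It assumes, as our identification, that the tabulated F(x) equals erf(x)/2, so that the number expected within ±k of the average is 300 × erf(k/c). The function names are illustrative.

```python
import math

# Frequency table: number of occurrences -> times observed (300 groups).
counts = {14: 1, 16: 3, 17: 2, 18: 3, 19: 7, 20: 9, 21: 18, 22: 26, 23: 21,
          24: 32, 25: 42, 26: 36, 27: 30, 28: 28, 29: 15, 30: 16, 31: 5,
          32: 2, 33: 2, 34: 1, 35: 1}

average = 25.04   # average found from the observations
c = 4.75          # modulus found from the observations
n = sum(counts.values())


def observed_within(k: int) -> int:
    """Number of groups whose divergence from the average is at most k."""
    return sum(t for v, t in counts.items() if abs(v - average) <= k)


def expected_within(k: int) -> float:
    """Number expected within +/- k: 2 F(k/c) * n, with F(x) = erf(x) / 2."""
    return n * math.erf(k / c)


for k in range(1, 13):
    print(k, observed_within(k), round(expected_within(k)))
```

Small differences from the printed column are to be expected, since the text rounds x to two places and F(x) to three before multiplying.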

The line $L_1 L_2 \ldots$ on the diagram of the curve of error
at the end of the book shows these numbers. The moduli are
made to correspond, which defines the abscissae, and the scale
of ordinates is then decided by making the areas of the two
figures equal, but this was not done exactly.

Summary of Terms. 

For convenience of reference the principal quantities con- 
nected with the curve of error are collected below. 



If we take $y = \frac{1}{c\sqrt{\pi}}\, e^{-\frac{x^2}{c^2}}$ as the
equation of the curve, c, which determines the unit of abscissae,
is called the modulus.

If $\delta_1, \delta_2, \ldots \delta_n$ are the differences between
the observations and their arithmetic average, c should be taken as
$\sqrt{\frac{2\Sigma\delta^2}{n}}$ or $\sqrt{\frac{2\Sigma\delta^2}{n-1}}$,
where n is the number of observations. (See pages 285 and 307.)

If the curve can be also determined as the limit of an
assigned binomial expansion $(p+q)^n$, then c may be taken as
equal to $\sqrt{2pqn}$. (See page 287.)

$\frac{2\Sigma\delta^2}{n}$ or $\frac{2\Sigma\delta^2}{n-1}$ is called
the fluctuation, and equals $c^2$. (See pages 285 and 307.)

h, $= \frac{1}{c}$, is the precision.

The square root of the average of the squares of the δ's,
$\sqrt{\frac{\Sigma\delta^2}{n}}$, is called the error of mean square,
or the standard deviation, and is represented by the letters ε or σ.

Hence $\epsilon = c\sqrt{\tfrac{1}{2}}$.

The arithmetic average of all the δ's, all reckoned as positive,
is called the average error. It is equal to the distance of the
centre of gravity of half the curve from the central ordinate. It is
represented by the letter η, and $\eta = \frac{c}{\sqrt{\pi}} = .564189\,c$.

The probable error is half the distance between the quartiles
of the observations. The ordinates through the points whose
abscissae are the probable errors bisect the two symmetrical
halves of the curve. The probable error is represented by the
letter r, and r = .4769363 c. This is generally written r = ρc, where
ρ = .4769363.

c, h, ε, η and r can all be calculated directly from the
observations. In general the values so calculated will not satisfy
the above numerical relations exactly; the correspondence depends
on the closeness of the fit of the observations to the curve.
Section III. To what Groups does Law of
Error apply?

Returning to our discussion on the relation between the laws
of probability and the numerical facts of actual experience, let
us consider the meaning of such phrases as "a rare
occurrence," "an improbable event," "a run of luck,"
"a lucky man," and similar expressions which show that some
events are regarded as ordinary, others as extraordinary. On
this subject there is a great deal of popular confusion; thus
the Spectator opens its columns to people who write about
extraordinary coincidences, e.g., that on 3rd March in two
successive years two persons of the same name died at the same
age in neighbouring villages; and recently the concurrence of
the two names Arthur and Mallory in a dispatch was instanced
as remarkable. Now, à priori, these two names are just as
likely to be mentioned together as any other two borne by
equal numbers of persons. If out of n persons, p bear the
one and q the other, the chance that the first two names given
in an assigned place will be these is about
$\frac{p}{n} \times \frac{q}{n}$; but the chance that they will occur
together in the newspapers in a given week is much greater, viz.,
$\frac{pq}{n^2} \times N$, where N is the number of pairs of names in
conjunction in all the columns of the press together. Going a step
further, consider the number of pairs of names that, when placed
together, would recall some event of historic or other interest,
and suppose this to be M; then the chance that some such
coincidence should arise in a given week is
$\frac{pq}{n^2} \times N \times M$, if we suppose for the sake of
argument that p and q are the same for all the pairs concerned.
From these remarks it will be seen that before we can speak of an
event as extraordinary, we must define the time, place,
circumstances, and nature of such events. Further, suppose we
decide to regard an event as unusual if the chance of its
occurrence thus defined is less than $\frac{1}{r}$, where r is a
large number, it is easily seen that we may expect the improbable,
to speak paradoxically; for great though r may be, the
number of events which come under our cognizance is also
great; and we may therefore expect to find on an average
one improbable event for every r we notice; hence it is possible
for a weekly newspaper with the help of the widely-extended
search for sensations of its intelligence department to supply
us week by week with its quantum of horrors. Another aspect
of the same subject will be seen when we deal with the
permanence and regularity of certain small numbers in Section IV.

The rarity of an event is often unconsciously determined
by a mental forecast of its occurrence. If I take four cards
out of a well-shuffled pack and find them to be
in succession, ace, king, queen, knave of hearts, I
should feel surprised; not because these four cards are less
likely to come than any other four assigned cards whatever,
but because I have certain associations with them in that they
form a sequence which is valuable in certain games, and are
the highest cards of a suit; there are noted in my mind
unconsciously many groups of four cards of such special
significance. If there are s such groups, the chance that one of
them will occur is $s \div \binom{52}{4}$, if we do not, and
$s \div (52 \times 51 \times 50 \times 49)$, if we do, regard
the order of their occurrence.

The real difference between a rare and a common event
is, however, independent of any mental process or prejudices.
If I place 8 coins one after another in front of me, it is no
more unlikely that I shall get 8 heads than that I shall get
any other assigned order of heads and tails, say h t t h t h t t;
the chance of either is $\frac{1}{2^8}$; but it is much more unlikely
that I shall get 8 heads than that I shall get 5 tails and 3 heads
without regard to the order in which they come; for out of
($2^8$ =) 256 possible arrangements, only 1 gives 8 heads, but
($\frac{8!}{5!\,3!}$ =) 56 give 5 tails and 3 heads.
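The counting in this paragraph can be confirmed with exact arithmetic. The following Python sketch is a small illustration; the variable names are ours.

```python
import math
from fractions import Fraction

tosses = 8
arrangements = 2 ** tosses  # every ordered sequence of heads and tails

# Any one assigned order, whether all heads or h t t h t h t t.
chance_of_one_order = Fraction(1, arrangements)

# 5 tails and 3 heads in any order: choose which 3 places hold the heads.
orders_with_three_heads = math.comb(tosses, 3)
chance_three_heads = Fraction(orders_with_three_heads, arrangements)

print(arrangements, orders_with_three_heads)
print(chance_of_one_order, chance_three_heads)
```

Fractions are used so that the two chances can be compared without rounding.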

Apply this argument to our hypothesis as to great numbers.
Suppose the population to be composed of males and females
in equal numbers, and that 1,000 persons are
selected on some system quite unconnected with sex.
Out of $2^{1000}$ (= $10^{301}$) possible selections (differing
from one another only in arrangement in order of
sexes) only 1 gives 1,000 males, but
$\frac{1000!}{500!\,500!}$ (= $10^{298} \times 27$) arrangements
give 500 of each sex, independently of the order. The
chance of the first occurrence is $\frac{1}{10^{301}}$,* of the
second about 1 in 37. In statistics we are concerned with the
totals, depending only on the combinations of the items, not on
their order (the permutations); and occurrences of the numbers
near the average (500, 499, 501, &c.) are separately, and much more
conjointly, very much more probable than occurrences of the
numbers far from it. The vast improbability of very great
divergence can be seen by a numerical study of the curve of
error (see p. 281).
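Python's exact integers allow these very large numbers to be computed without rounding. The sketch below is illustrative, with names of our choosing. Carried out exactly, the chance of an even division comes to about 1 in 40; the round figure of the text follows from the rounded powers of ten used there.

```python
import math

persons = 1000
selections = 2 ** persons                        # all ordered arrangements of sexes
even_split = math.comb(persons, persons // 2)    # arrangements with 500 of each sex

digits_in_total = len(str(selections))           # size of 2^1000 in decimal digits
chance_all_male = 1 / selections                 # one arrangement out of all
chance_even_split = even_split / selections      # exact ratio, then rounded to float

print(digits_in_total)
print(chance_all_male)
print(chance_even_split, 1 / chance_even_split)
```

Division of one exact integer by another is correctly rounded in Python, so no precision is lost before the final step.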

Hence the theorems relating to great numbers rest on a 
very much firmer basis than they would if divergence was 
due to that sort of coincidence which produces a so-called 
rare event. 

* Chance of 1,000 m. or 1,000 f. is twice this.

A "run of luck," good or bad, may be regarded as a
succession of improbable events, and is a more scientific expression
than a "rare event" as commonly understood. Of
a great number of events, deals of cards, investments,
bets, and so on, very many will give normal results,
average success at cards, normal returns to investments and
so on; very few will give abnormal winnings. The chance
of abnormal success in one venture being p, a small fraction,
the chance of a succession of n successes is $p^n$, very much
smaller when n is at all large. It is in the phrase "lucky
man" that the error is introduced. One who has benefited
by the occurrence of a rare event may reasonably be called
lucky, and the number of lucky men will be roughly proportional
to the number of fortunate rare events; but when a succession of
events, say three, each of probability $\frac{1}{100}$, and
conjointly of probability $\frac{1}{1000000}$, or a broken
succession (e.g., ppqpp, of which the chance would be
$\frac{1}{20000000}$) has taken place in one man's
favour, the imagination loses the logic of the case, and
supposes an overruling law, and marks out that particular man as
not subject to the law of probability: one is apt to expect
that the next event will also be a success, and to be
further confirmed in this opinion by paying attention to
the one instance when the sixth event is a success, and
neglecting the ninety and nine when it is a failure. Other
people are biassed in the opposite direction, and have
distinctly too great an expectation of a counterbalancing tendency,
a long run of failures till the average is restored. It is thus
correct to speak of a man having been lucky, but tempting
Nemesis to speak of him as a lucky man. It is a mere
truism to say that, unless a success or failure have some
causal influence over future successes or failures (as when a
good stroke at a game steadies the nerves for another), the
probability of each future event is totally unaffected by what has
gone before.

Let us return now to the method of deducing the chance p,
and the index n, used in the expansion $(p+q)^n$ from records
such as the death-rate. Notice first that the
deduction of p (the chance to be applied to each
individual to find the varying degrees of probability of the
possible totals) from the numbers, implies some hypothesis
as to the genesis of these numbers, the very theorem which
we wish it to illustrate; for suppose that in the records of
20 years we find 600,000 deaths in a stationary population of
1,000,000, we assume that this is the most probable number
which a chance regime would give, and since the most probable
number is the total × p, we deduce that
$p = \frac{600000}{1000000} \times \frac{1}{20} = \frac{3}{100}$; but
here we are making some undefined assumption about the
occurrence of events similar to that defined by the curve of
error. If we actually assume the law of error, we can
calculate how far the value of p so estimated may be expected to
differ from its true value. This accounts in part for the
divergence between the calculated grouping and the fact.

Again, there is great difficulty in determining the number n, 
the number of persons to whom the chance of the occurrence of 
a particular event is applied, and we should further notice 
that in many cases, in particular in concrete measurements, 
such as height and normal length of life, we have no 
information whatever as to n, which in this case is the number 
of causes which may add or subtract undefined units from 
height or age; and we are often equally in doubt when 
dealing with great numbers, e.g., with the total value of 
imports. In such cases we should have to deduce both p 
and n from the records of results; and indeed it is simpler 
to fall back on other methods of deducing the law of error than 
the present one of regarding it as the limit of the binomial 
expansion, determining the modulus without any assumption as 
to the number of independent causes. Hence in a great many 
instances we cannot expect to find close conformity to a 
predetermined curve $(p+q)^n$. Similarly we can deduce from the 
laws of gravitation and motion that a planet's orbit must be 
an ellipse, but cannot determine the eccentricity of this ellipse 
except by observation. 

A far-reaching cause of the apparent discrepancy between fact 
and theory is, however, of a different kind. The theory applies 
to experiments performed under unchanging conditions; if we are 
drawing differently coloured balls from a bag containing a great 
number, all the external circumstances must be unchanged, and 
the only variation that which comes from the so-to-say regulated 
randomness of the forces which decide shuffling and drawing. 
Now in human affairs, when we consider a series of death-rates 
or any other rates distributed in time, we are dealing with a 
constantly changing environment of social and sanitary habits, 
within which the apparently random forces that decide death 
are acting ; and these external changes may affect the inter- 
action of the random forces, just as a change in barometric 
pressure may affect the molecular forces of a rigid body. Such 
effects cannot be foretold or calculated ; we may expect that 
improvements in sanitation will diminish the death-rate, but 
some detail may increase it ; vaccination may diminish small- 
pox, but increase the liability to some other disease. To such 
reasons as these should be assigned the non-correspondence to 
the law of error of great numbers distributed in time. When 
the element of time is eliminated by a process of random 
averaging the correspondence is closer. Great numbers distri- 
buted in space are exempt from this disturbing cause and might 
be expected to show closer correspondence ; for instance the 
birth-rates in a number of districts might be expected to con- 
form more closely than rates for one place for a series of years ; 
but it is very difficult to obtain sufficiently homogeneous 
figures distributed in space; though Prof. Lexis gives some 
instances of this kind.* 

* Massenerscheinungen, p. 66. 

Physiological and anthropometrical measurements, such as 
the heights of 10,000 children of the same age, are not affected 
by these difficulties, and should show close correspondence with 
the theoretical distribution ; and it is not surprising that the ratio 
of the number of male to the number of female births, depending 
as it does on hidden causes not easily influenced by the progress 
of civilisation, should show that remarkable consilience with the 
law of error, which has so often been remarked. Finally, the 
occurrence of sequences and groups of numbers, such as those 
obtained from logarithmic tables, being absolutely independent 
of changes in time or space, naturally shows complete agreement 
with theory. 

[The use of the law of error.] 

All these considerations make the application of the law of 
error to actual measurements a very delicate operation, and it 
may appear that the cases where agreement is close are so few 
as to make the whole body of theory useless; but this is an 
unscientific view to take. The 
general process of applied science is to frame hypotheses as 
nearly consistent with the facts as is possible without such com- 
plications as will prevent their use, and then apply to the 
idealized case the corrections which the actual cases necessitate. 
This process has led to the best results in physical science. In 
the problems dealt with by the law of error, it will be found that 
many deductions from the idealized cases hold also when applied 
to the only partially corresponding records of great numbers ; 
just as, in mechanics, many theorems relating to smooth bodies 
can be applied unchanged to rough bodies. For instance, the 
" fluctuation " of non-corresponding figures can be calculated by 

the formula ; and the accuracy of an average of random 

n 

samples of quantities not grouped according to the curve of 

error varies as the square root of the number of samples taken.* 

From this discussion we may gather that we can seldom tell 
à priori whether the law of error will or will not apply to a given 
series of figures. This must be determined by experiment for 
each new class of records ; but when we have found correspond- 
ence in many series of a class (as is the case in measurement of 
heights) we may proceed with confidence to apply the law to 
other similar series or groups. 

* See p. 308. 

[Concrete measurements and great numbers.] 

An important distinction is drawn by Prof. Lexis † and 
emphasised by Prof. Edgeworth * between two classes of figures to 
which the laws of great numbers apply. The first, called by 
Lexis concrete, contains such quantities as height measurements 
of a great number of persons, and normal length of life, where 
a definite mean or type seems to be normal and other 
measurements to be variations from this type. In these cases it 
is not easy to connect the facts which correspond with the 
exponential curve $y = e^{-x^2}$, where x is the divergence from 
the type, with our deduction from the limit (n infinite) of 
$(p+q)^n$. Suppose, however, that height is determined by n 
forces, each capable of adding or of subtracting 1 unit, say 
1 millimetre, from normal height, and that the chance that each 
shall act is p; then the divergencies obtained in a number of 
individuals should be distributed according to the coefficients 
in this expansion. 

† Ibid., p. 28. 

The other class, called by Lexis combinational, to which the 
discussion in Section II. above more directly applies, contains 
those totals which are the sum of a great number of items 
(persons, deaths, births, &c.), for the existence of each of 
which a definite chance, p, can be assigned à priori. The numbers 
may then be expected (subject to the disturbing causes already 
discussed) to be grouped in accordance with the curve 

$$y = \frac{1}{\sqrt{2\pi pqn}}\, e^{-\frac{x^2}{2pqn}},$$

where n is the total number of persons to whom the chance p 
applies. In such cases p is the arithmetic average of 
$p_1, p_2, \ldots p_n$, where $p_1 z, p_2 z, \ldots p_n z$ are the 
numbers of events which are found respectively in n series each 
of z observations. p is the "probability coefficient" of the 
event, and $p_1, p_2, \ldots p_n$ should conform to a curve with 
modulus $\sqrt{\frac{2pq}{z}} = c$. On the other 

hand, if c is calculated from the formula 

$$c = \sqrt{\frac{2\,\Sigma (p_r - p)^2}{n}},$$

we are treating the p's as "concrete" quantities, and obtain a 
second value for the modulus. 

If $\sqrt{\frac{2\,\Sigma (p_r - p)^2}{n}} - \sqrt{\frac{2pq}{z}} = 0$, 
the distribution of the coefficients $p_1, p_2, \ldots p_n$ is 
normal, which is not often the case. 

* Jubilee Volume of Statistical Journal, p. 191. 

If this quantity < 0, the coefficients are grouped more closely 
together than the theory of error leads us to expect, and there 
is some evidence that a force preventing divergence has been 
called into play. 

More generally this quantity is > 0, the coefficients are more 
divergent than in accordance with the theory of error, and some 
disturbing forces have acted. 
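The comparison of the two values of the modulus may be sketched in a few lines. The function names and the five rates are hypothetical, chosen only to illustrate the sign of the difference; they are not taken from any record.

```python
import math

def combinational_modulus(rates, z):
    """Modulus sqrt(2pq/z) expected if each series is z trials of chance p."""
    p = sum(rates) / len(rates)
    return math.sqrt(2.0 * p * (1.0 - p) / z)

def concrete_modulus(rates):
    """Modulus sqrt(2 * sum((p_r - p)^2) / n) found from the rates themselves."""
    n = len(rates)
    p = sum(rates) / n
    return math.sqrt(2.0 * sum((r - p) ** 2 for r in rates) / n)

def dispersion_excess(rates, z):
    """Positive: more divergent than theory; negative: more closely grouped."""
    return concrete_modulus(rates) - combinational_modulus(rates, z)

# Hypothetical rates from five series of z = 10,000 observations each.
rates = [0.0290, 0.0310, 0.0300, 0.0295, 0.0305]
excess = dispersion_excess(rates, 10_000)
print(concrete_modulus(rates), combinational_modulus(rates, 10_000), excess)
```

With these invented figures the excess is negative, the case in which the coefficients are grouped more closely than the theory of error would lead us to expect.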
Section IV. The Permanence of Certain Small Numbers. 

[The binomial expansion and small numbers.] 

A remarkable side-light is thrown on our general argument 
by the actual permanence of small numbers. Little attention 
has been given to this phenomenon, but it is a very striking 
fact that if among a great number of items there are a few 
which present some particular feature, it will be found that 
this small number is seldom much exceeded and seldom entirely 
vanishes. 

The following numerical example shows that this may 
be expected theoretically, and an examination of the successive 
terms of $(p+q)^n$ when p is very small and q nearly equal to 1 
will show the same phenomenon more generally. 

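Constancy of Small Numbers. 

Expansion of $\left(\frac{1}{1001} + \frac{1000}{1001}\right)^{4000}$. 

1st term $= 1000^{4000} \div 1001^{4000}$. 

$\log \left(\frac{1000}{1001}\right)^{4000} = 4000\,(3 - 3.0004340) = -1.7364 = \bar{2}.2636 = \log .018348$. 

Chance of 

No occurrences, 1st term $= a$, suppose   = .0183 

1 occurrence, 2nd term $= 4000 \times 1000^{3999} \div 1001^{4000} = 4a$   = .0734 

2 occurrences, 3rd term $= \frac{4000 \times 3999}{1 \cdot 2} \times 1000^{3998} \div 1001^{4000}$ 
    $= \frac{4000 \times (4000 - 1)}{1 \cdot 2} \times 1000^{3998} \div 1001^{4000}$ 
    $= 8 \times 1000^{4000} \div 1001^{4000}$, correct to 1 in 4000, $= 8a$   = .1467 

3 occurrences, 4th term $= \frac{4000 \times 3999 \times 3998}{1 \cdot 2 \cdot 3} \times 1000^{3997} \div 1001^{4000}$ 
    $= \frac{4000 \times (4000^2 - 3 \times 4000)}{1 \cdot 2 \cdot 3} \times 1000^{3997} \div 1001^{4000}$ 
    $= \frac{32}{3} \times 1000^{4000} \div 1001^{4000}$, less 3 in 4000 approx.   = .1956 

4 occurrences, 5th term $= \frac{32}{3}\, a$, less 6 in 4000 approx.   = .1954 

5 occurrences, 6th term $= \frac{128}{15}\, a$, less 10 in 4000 approx.   = .1562 

6 occurrences, 7th term $= \frac{256}{45}\, a$, less 15 in 4000 approx.   = .1040 

7 occurrences, 8th term $= \frac{1024}{315}\, a$, less 22 in 4000 approx.   = .0593 

8 occurrences, 9th term $= \frac{512}{315}\, a$, less 30 in 4000 approx.   = .0296 

10th term   = .0131 

11th term   = .0052 

12th term   = .0019 

13th term   = .0006 

14th term   = .0002 

Terms 15 to 4001 together only occur about 1 in 10000. 

Chance of 3, 4, 5, or 6 occurrences = .65 approx. 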
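The terms of this expansion can be checked directly with exact integer arithmetic. The sketch below is only a verification aid; the function name is illustrative, and the chance 1/1001 and index 4000 are those of the example.

```python
from math import comb

def binomial_term(n, k, num, den):
    """Chance of exactly k occurrences among n items, each with chance num/den.

    Numerator and denominator are exact integers; only the final
    division is carried out in floating point.
    """
    return comb(n, k) * num**k * (den - num)**(n - k) / den**n

# Terms 1 to 15 of (1/1001 + 1000/1001)^4000, i.e. 0 to 14 occurrences.
terms = [binomial_term(4000, k, 1, 1001) for k in range(15)]
central = sum(terms[3:7])  # chance of 3, 4, 5, or 6 occurrences

for k, t in enumerate(terms):
    print(k, round(t, 4))
print("3 to 6 occurrences:", round(central, 4))
```

The same function, applied with other small chances and large indices, shows how the terms crowd about the small most probable number.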
To take an actual example: Out of some 530,000 deaths 
annually from all causes the following are the numbers from 
splenic fever in the years 1875-1894: 

5, 4, 10, 14, 12, 18, 9, 15, 8, 18, 11, 11, 11, 12, 7, 4, 3, 6, 7, 10. 
Average 10. 

Here $p = \frac{10}{530000} = \frac{1}{53000}$, $q = \frac{52999}{53000}$. 

n is doubtful, and may be taken to be the total number 
of deaths or the total population; but it will be found that 
the following numbers are unaffected, whichever number we 
adopt. 

The successive terms in the expansion $(p+q)^{530000}$ are given in the second column. 

Chance of 
 0 deaths is    .000045 
 1              .00045 
 2              .00225 
 3              .0075 
 4              .0185 
 5              .037 
 6              .061 
 7              .087 
 8              .11 
 9              .12 
10              .12 
11              .11 
12              .09 
13              .07 
14              .05 
15              .03 
16              .02 
17              .01 
18              .00 
More than 18 

[Columns "Number of occurrences to be expected in 20 years" and "Number observed": entries not preserved.] 

Considering the small number of years taken, and the in- 
definiteness of many of the death returns, the general consilience 
between the last two columns is satisfactory ; while the general 
principle that small numbers show a certain constancy is well 
exemplified. Specialists in all professions, from the doctor who 
treats only one obscure disease of the ear, to the dealer in 
curiosities, make their livelihood dependent on this permanence 
of small numbers. 
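The comparison between expected and observed occurrences can be recomputed from the twenty yearly figures. This is a sketch only: n is taken as 530,000 and p as 10/530,000, as in the text, and the terms are evaluated through logarithms (a choice made here for convenience, not the author's method of calculation).

```python
from collections import Counter
import math

# Deaths from splenic fever, 1875-1894, as listed in the text.
deaths = [5, 4, 10, 14, 12, 18, 9, 15, 8, 18,
          11, 11, 11, 12, 7, 4, 3, 6, 7, 10]

n = 530_000          # taken as the annual number of deaths from all causes
p = 10 / n           # chance for each item, from the average of 10

def chance(k):
    """Binomial chance of exactly k deaths in a year, computed via logs."""
    log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_term)

observed = Counter(deaths)                          # years showing k deaths
expected = {k: 20 * chance(k) for k in range(19)}   # years expected in 20

for k in range(19):
    print(k, round(chance(k), 5), round(expected[k], 2), observed.get(k, 0))
```

Replacing n by the total population while keeping the yearly average at 10 alters the terms only in the later decimal places, which is the sense in which the numbers are said to be unaffected by the choice.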

The regular occurrence of accidents and of improbable events 
in general furnishes other examples of the same sort. 

Note. Since writing this section my attention has been called to a 
treatise by Dr Bortkewitsch, Das Gesetz der kleinen Zahlen, Leipzig, 1898, 
where the close agreement of the records of accidents and other occasional 
events to the binomial expansion is dealt with in a more exhaustive and 
analytical manner. 

Section V. Extension of the Law of Error and Applications. 

[Generalized statement of law of error.] 

We have only shown so far that great numbers fluctuate 
about their mean in accordance with the law of error on the 
assumption that for the existence or non-existence of each 
particular unit there is the same numerical chance p. We can, 
however, prove by elementary methods that the same distribution 
is reached under many other circumstances, and at the same time 
make several important deductions. 

Suppose that a quantity whose mean value is H is determined 
by the action of a great number of causes; let the causes 
produce deviations $\epsilon_1, \epsilon_2, \ldots$ which are connected with $\eta$, the 
corresponding deviation of H, by the equation 
$\eta = a_1\epsilon_1 + a_2\epsilon_2 + \ldots$, where $a_1, a_2, \ldots$ are constants. If each 
of the deviations $\epsilon_1, \epsilon_2, \ldots$ can be of various magnitudes, the 
curves which show the probabilities of the occurrence of these 
magnitudes are called "curves of frequency," or "facility 
curves." If the curves of frequency are normal curves of error, 
the chances of the occurrence of the deviations $\epsilon_1, \epsilon_2, \ldots$ are 
proportional to $e^{-\epsilon_1^2/c_1^2}, e^{-\epsilon_2^2/c_2^2}, \ldots$, where $c_1, c_2, \ldots$ are the 
moduli of these curves. The following proof holds when these 
assumptions are justified; but the resulting theorems hold (1) 
when the $\epsilon$'s belong to any curves of frequency such that their 
limits are narrow, while the number of $\epsilon$'s is great, and the 
limits of each of the $\epsilon$'s are small compared with $\eta$, and none 
of the a's are predominant; (2) where $\eta$ is any function of 
$(a_1\epsilon_1 + a_2\epsilon_2 + \ldots)$, such that $\eta = a_1\epsilon_1 + a_2\epsilon_2 + \ldots$ is a first 
approximation.* 

The equation to the normal curve can also be deduced 
directly from other considerations, when we are dealing with any 
quantity liable to continuous small independent variations.† 

We will now show that when the assumptions are limited 
as above, $\eta$ belongs to a normal curve of error with 
modulus $\sqrt{a_1^2c_1^2 + a_2^2c_2^2 + \ldots}$. 

* Adapted from a paper by Professor Edgeworth in the London and 
Edinburgh Philosophical Magazine, vol. 34, 1892, p. 429. 

† See, for instance, Chauvenet's Astronomy, vol. ii., Appendix; and 
Merriman's Method of Least Squares, chap. ii. 

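Case I. If H is determined by two causes only, $\eta = a_1\epsilon_1 + a_2\epsilon_2$. 

Probability that the deviations $\epsilon_1$, $\epsilon_2$ concur 
$= C e^{-\epsilon_1^2/c_1^2} \times e^{-\epsilon_2^2/c_2^2}$ (where C is a constant) $=$ (eliminating $\epsilon_2$) 

$$C\, e^{-\frac{\epsilon_1^2}{c_1^2} - \frac{(\eta - a_1\epsilon_1)^2}{a_2^2c_2^2}} = C\, e^{-\frac{\eta^2}{a_1^2c_1^2 + a_2^2c_2^2}} \times e^{-\frac{a_1^2c_1^2 + a_2^2c_2^2}{a_2^2c_1^2c_2^2}\left(\epsilon_1 - \frac{a_1c_1^2\,\eta}{a_1^2c_1^2 + a_2^2c_2^2}\right)^2}.$$

This quantity is the probability that a deviation $\epsilon_1$ occurs with a 
deviation $\eta$. Giving $\epsilon_1$ all its possible values in turn, we have 
probability of a deviation 

$$\eta = C\, e^{-\frac{\eta^2}{a_1^2c_1^2 + a_2^2c_2^2}} \times \sum_{\epsilon_1 = -\infty}^{\epsilon_1 = \infty} e^{-\frac{a_1^2c_1^2 + a_2^2c_2^2}{a_2^2c_1^2c_2^2}\left(\epsilon_1 - \frac{a_1c_1^2\,\eta}{a_1^2c_1^2 + a_2^2c_2^2}\right)^2}.$$

Now the quantity included in the summation is the whole of a 
normal curve of error, which has been shifted through a horizontal 
distance $\frac{a_1c_1^2\,\eta}{a_1^2c_1^2 + a_2^2c_2^2}$, and its value depends only on the constants 
$a_1, a_2, c_1, c_2$, and is independent of $\eta$ and $\epsilon_1$. 

Hence probability of a deviation $\eta = e^{-\frac{\eta^2}{a_1^2c_1^2 + a_2^2c_2^2}} \times$ constant, and 
$\eta$ belongs to a normal curve of error with modulus $\sqrt{a_1^2c_1^2 + a_2^2c_2^2}$. 

Case II. If H is determined by three causes, $\eta = a_1\epsilon_1 + a_2\epsilon_2 + a_3\epsilon_3$, 
write $\eta_1$ for $a_1\epsilon_1 + a_2\epsilon_2$, and $\eta = \eta_1 + a_3\epsilon_3$. 

By theorem of Case I., modulus for $\eta_1$ is $\sqrt{a_1^2c_1^2 + a_2^2c_2^2}$, 
and modulus for $\eta$ is 

$$\sqrt{(a_1^2c_1^2 + a_2^2c_2^2) + a_3^2c_3^2} = \sqrt{a_1^2c_1^2 + a_2^2c_2^2 + a_3^2c_3^2}.$$

Similarly the theorem can be extended to any number of causes. 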
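The theorem may be tried by a simple random experiment. The sketch below is an illustration and not part of the proof: the weights and moduli are hypothetical, the seed is arbitrary, and each cause is drawn from a normal curve whose standard deviation is its modulus divided by $\sqrt{2}$ (the relation between modulus and standard deviation assumed here).

```python
import math
import random

def modulus(values):
    """Modulus sqrt(2 * sum(d^2) / n) of deviations from the average."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(2.0 * sum((v - mean) ** 2 for v in values) / n)

def predicted_modulus(weights, moduli):
    """sqrt(a1^2 c1^2 + a2^2 c2^2 + ...), as given by the theorem."""
    return math.sqrt(sum((a * c) ** 2 for a, c in zip(weights, moduli)))

rng = random.Random(0)            # fixed seed so the sketch is repeatable
weights = [1.0, 2.0, 0.5]         # hypothetical constants a1, a2, a3
moduli = [3.0, 1.0, 4.0]          # hypothetical moduli c1, c2, c3

draws = [
    sum(a * rng.gauss(0.0, c / math.sqrt(2.0)) for a, c in zip(weights, moduli))
    for _ in range(50_000)
]
observed = modulus(draws)
predicted = predicted_modulus(weights, moduli)
print(observed, predicted)
```

With these constants the predicted modulus is $\sqrt{17}$, and the modulus found from the simulated deviations should lie close to it.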

Corollaries. 

1. Let $\bar{x}$ be the weighted average of the quantities $x_1, x_2, \ldots$ with 
weights $a_1, a_2, \ldots$, so that $\bar{x} = \frac{\Sigma ax}{\Sigma a}$; 
suppose that the weights are known accurately, but that $x_1 = x_1' + \epsilon_1$, 
$x_2 = x_2' + \epsilon_2, \ldots$ where $x_1', x_2', \ldots$ are correct values, and 
$\epsilon_1, \epsilon_2, \ldots$ errors belonging to normal curves with moduli 
$c_1, c_2, \ldots$ Then if $\bar{x}'$ be the correct value of $\bar{x}$, 

$$\bar{x} - \bar{x}' = \frac{\Sigma a(x - x')}{\Sigma a} = \frac{\Sigma a\epsilon}{\Sigma a}.$$

Hence the modulus of $\bar{x}$ is $\frac{\sqrt{a_1^2c_1^2 + a_2^2c_2^2 + \ldots}}{a_1 + a_2 + \ldots}$, for each term in the above 
theorem can be divided by the constant $a_1 + a_2 + \ldots$* 

2. Putting $a_1 = a_2 = \ldots = a_n$, and $c_1 = c_2 = \ldots = c$, we find as before 
that modulus of an unweighted average of n quantities, conforming 
to a curve with modulus c, is $\frac{\sqrt{n}\, c}{n} = \frac{c}{\sqrt{n}}$. 

3. If H is the difference between two quantities whose mean values 
are the same, and moduli $c_1$ and $c_2$, 
$\eta$, the deviation in H, $= \epsilon_1 - \epsilon_2$; modulus for $\eta = \sqrt{c_1^2 + c_2^2}$. 

4. If the two quantities are the averages of $n_1$ and $n_2$ quantities with 
moduli $v_1$, $v_2$, then by corollary 2, $c_1 = \frac{v_1}{\sqrt{n_1}}$, $c_2 = \frac{v_2}{\sqrt{n_2}}$, and modulus for 
difference between the quantities is by corollary 3, $\sqrt{\frac{v_1^2}{n_1} + \frac{v_2^2}{n_2}}$. 

5. In particular, the modulus for the difference between two 
quantities from one group, modulus c, is $c\sqrt{2}$. 

Corollaries 3, 4, and 5 can be proved directly by the method 
of Case I. 
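The five corollaries reduce to a few short formulae, collected below for convenience. The function names are illustrative only; the figures used in the trial lines (modulus 5, groups of 30 and 20) are chosen merely as examples.

```python
import math

def modulus_weighted_average(weights, moduli):
    """Corollary 1: sqrt(sum(a^2 c^2)) / sum(a)."""
    return math.sqrt(sum((a * c) ** 2 for a, c in zip(weights, moduli))) / sum(weights)

def modulus_of_average(c, n):
    """Corollary 2: c / sqrt(n) for an unweighted average of n quantities."""
    return c / math.sqrt(n)

def modulus_of_difference(c1, c2):
    """Corollary 3: sqrt(c1^2 + c2^2)."""
    return math.hypot(c1, c2)

def modulus_of_difference_of_averages(v1, n1, v2, n2):
    """Corollary 4: sqrt(v1^2 / n1 + v2^2 / n2)."""
    return math.sqrt(v1 ** 2 / n1 + v2 ** 2 / n2)

# Equal weights and equal moduli reduce corollary 1 to corollary 2.
print(modulus_weighted_average([1.0] * 30, [5.0] * 30), modulus_of_average(5.0, 30))
# Corollary 5: two quantities from one group, modulus c, give c * sqrt(2).
print(modulus_of_difference(5.0, 5.0))
```

Corollary 4 is the form most often wanted in practice, since it gives at once the modulus against which an observed difference between two averages is to be judged.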

Precision of an Average. 

The second corollary, that the precision of an average is 
proportional to the square root of the number of terms it 
contains, is so important that an independent proof may be 
offered, starting from different assumptions. 

Suppose a great number (m) of observations to be made 
of a single unknown quantity, e.g., the declination of a star. 
Let r be the "probable error" of a single observation, h the 
precision of the group, v the true, $v+d_1, v+d_2, \ldots v+d_m$ the 
observed values of the quantity. Let $x_0$ be the arithmetic 
mean of the m observed values, and let $x_0 = v + \delta$. Then $\delta$ is 
the error of the arithmetic mean. 

Let $\delta_1, \delta_2, \ldots \delta_m$ be the differences between the observed values and 
$x_0$, so that $v + d_1 = x_0 + \delta_1 = v + \delta + \delta_1$; $d_1 = \delta + \delta_1$, $d_2 = \delta + \delta_2$, &c.; also 
$m x_0 = \Sigma(x_0 + \delta_1) = m x_0 + \Sigma\delta_1$, and $\Sigma\delta_1 = 0$. 

* This should be compared with pp. 204-214, supra. 

Let $P_1$ be the probability that this set of observations concur. Then 

$$P_1 = A e^{-h^2 d_1^2} \times A e^{-h^2 d_2^2} \times \ldots \text{ to } m \text{ products} = A^m e^{-h^2\,\Sigma(d_1^2)}$$

$$= A^m e^{-h^2\,\Sigma(\delta + \delta_1)^2} = A^m e^{-h^2\{m\delta^2 + 2\delta\,\Sigma(\delta_1) + \Sigma(\delta_1^2)\}}$$

$$= A^m e^{-h^2 m \delta^2} \times e^{-h^2\,\Sigma(\delta_1^2)}, \text{ since } \Sigma(\delta_1) = 0.$$

$P_1$ is the probability that the observed values yield an error $\delta$ in their 
arithmetic mean. Let $P_0$ be the probability that the observed values 
yield no error in their arithmetic mean, then 

$$P_0 = A^m e^{-h^2\,\Sigma(\delta_1^2)}, \qquad P_1 = P_0 \times e^{-h^2 m \delta^2}.$$

Hence the arithmetic mean belongs to a curve of error whose 
precision is $\sqrt{h^2 m} = h\sqrt{m}$, and therefore its probable error is $\frac{r}{\sqrt{m}}$. 

If the errors $d_1, d_2, \ldots$ occur $p_1, p_2, \ldots p_m$ times respectively in the 
observations, while $p_1 + p_2 + \ldots + p_m = n$, the foregoing argument is 
unaffected, and the precision of the mean is $h\sqrt{n}$, that is 
$h\sqrt{p_1 + p_2 + \ldots + p_m}$. 

Care is needed to distinguish the hypotheses on which this 
formula, and the former formula $\frac{\sqrt{\Sigma a^2c^2}}{\Sigma a}$ connecting weights, 
depend. 

A corresponding result may be obtained directly from the 
limit of the binomial expansion. If an experiment, for whose 
success the chance is p, is performed n times, the most probable 
number of successes is the nearest integer to pn, and the 
modulus for the various numbers is $\sqrt{2pqn}$. The modulus for 
the average of the n experiments is therefore $\frac{\sqrt{2pqn}}{n} = \frac{\sqrt{2pq}}{\sqrt{n}}$; that 
is, the precision is proportional to $\sqrt{n}$. 
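The binomial form of the result is easily put to a numerical trial. In the sketch below the function names are illustrative, the chance 0.03 is an arbitrary example, and precision is taken as the reciprocal of the modulus.

```python
import math

def modulus_of_successes(p, n):
    """Modulus sqrt(2pqn) for the number of successes in n experiments."""
    return math.sqrt(2.0 * p * (1.0 - p) * n)

def modulus_of_average(p, n):
    """Modulus of the proportion of successes: sqrt(2pqn) / n."""
    return modulus_of_successes(p, n) / n

def precision_of_average(p, n):
    """Precision taken as the reciprocal of the modulus."""
    return 1.0 / modulus_of_average(p, n)

p = 0.03  # arbitrary example value
ratio = precision_of_average(p, 400) / precision_of_average(p, 100)
print(ratio)  # four times the experiments, twice the precision
```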

We can now obtain a formula for the modulus of a series 
of observations in a form often given. On p. 285 it is shown 
that if $\delta_1, \delta_2, \ldots \delta_n$ are divergencies from their average of a 
series of observations, and if these divergencies conform to a 
law of error with modulus c, then c should be taken as 
$\sqrt{\frac{2\Sigma\delta^2}{n}}$, and the centre of the curve at the average, for 
maximum probability. Now the average from which these 
divergencies are measured conforms to a curve modulus 
$\frac{c_1}{\sqrt{n}}$, where $c_1$ is the modulus of the divergencies measured 
from their true value, not from their arithmetic mean; 

then, if $\Delta$ is the divergence from the true value, modulus $c_1$, 
$\delta$ is the divergence from the arithmetical mean, modulus c, 
d is the divergence of the arithmetic mean from the true 
value, modulus $\frac{c_1}{\sqrt{n}}$, 

$$\Delta = \delta + d$$

and $c_1^2 = c^2 + \frac{c_1^2}{n}$, from page 304. 

Hence 

$$c_1^2 = \frac{n}{n-1}\, c^2 = \frac{n}{n-1} \cdot \frac{2\Sigma\delta^2}{n} = \frac{2\Sigma\delta^2}{n-1}.$$

Since n is large these quantities are very nearly equal; and 
it is not worth while here to discuss their relative merits; the 
latter quantity $\sqrt{\frac{2\Sigma\delta^2}{n-1}}$ is generally used.* 

As an example of this greater precision of averages, take 
the averages given on p. 289, each of 30 numbers, which 
range on a normal curve modulus 5; these averages are 25.9, 
25.7, 24.4, 24.4, 24.5, 24.9, 24.8, 25.7, 24.5, 25.7. General average 
25.05; differences, .85, .65, -.65, -.65, -.55, -.15, -.25, .65, 
-.55, .65; $\sqrt{\frac{2\Sigma\delta^2}{9}} = .89$. The modulus for such groups is by 
theory $\frac{5}{\sqrt{30}} = .91$ ... 

From the same page we may find the following 30 averages, 
each of 10 numbers: 23.6, 24.8, 24.4, 24.6, 25.3, 23.6, 25.7, 
25.8, 26.6, 26.0, 24.5, 25.0, 24.7, 26.9, 25.5, 23.8, 24.3, 24.3, 24.0, 
23.8, 24.4, 26.6, 24.4, 25.5, 25.6, 26.8, 24.0, 26.2, 25.6, 25.0. 

The probable error for these is by theory .47 of $\frac{5}{\sqrt{10}} = .742$; 
and between the limits 25.04 ± .74 we actually find 15 out of the 
30 averages, while 6 are below the lower, and 7 above the higher 
limit, the remaining 2, at 24.3, falling on the lower limit itself. 

* Vide article by Prof. Edgeworth, Camb. Phil. Soc. Trans., 1885. 

The modulus for the whole 300 is $\frac{5}{\sqrt{300}} = .29$ and 
the probable error .14; the average for an infinite number would 
be 25; for the 300 selected it is 25.043, that is well within the 
probable error. 
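These figures may be verified from the averages as printed. The sketch below assumes the factor 0.4769 for converting a modulus into a probable error (the text rounds it to .47), and counts a value as lying "on" a limit when it agrees with it to within a small tolerance; both the names and the tolerance are choices made here.

```python
import math

PROBABLE_ERROR_FACTOR = 0.4769  # probable error = 0.4769 x modulus (assumed constant)

averages_of_30 = [25.9, 25.7, 24.4, 24.4, 24.5, 24.9, 24.8, 25.7, 24.5, 25.7]
averages_of_10 = [23.6, 24.8, 24.4, 24.6, 25.3, 23.6, 25.7, 25.8, 26.6, 26.0,
                  24.5, 25.0, 24.7, 26.9, 25.5, 23.8, 24.3, 24.3, 24.0, 23.8,
                  24.4, 26.6, 24.4, 25.5, 25.6, 26.8, 24.0, 26.2, 25.6, 25.0]

def modulus(values, divisor):
    """sqrt(2 * sum(d^2) / divisor), deviations taken from the average."""
    mean = sum(values) / len(values)
    return math.sqrt(2.0 * sum((v - mean) ** 2 for v in values) / divisor)

observed_modulus = modulus(averages_of_30, len(averages_of_30) - 1)
theoretical_modulus = 5 / math.sqrt(30)

# Limits used in the text for the thirty averages of 10: 25.04 +/- .74.
centre, half_width, tol = 25.04, 0.74, 1e-9
below = sum(1 for v in averages_of_10 if v < centre - half_width - tol)
above = sum(1 for v in averages_of_10 if v > centre + half_width + tol)
on_limit = sum(1 for v in averages_of_10 if abs(abs(v - centre) - half_width) <= tol)
within = len(averages_of_10) - below - above - on_limit

print(observed_modulus, theoretical_modulus)
print(PROBABLE_ERROR_FACTOR * 5 / math.sqrt(10), within, below, above, on_limit)
print(sum(averages_of_10) / 30, 5 / math.sqrt(300))
```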

Examples of this kind could be multiplied indefinitely. 

Samples. 

The bearing of this principle on the method of sampling 
is very important. Our experience on most subjects is derived, 
not by examining all the existing examples, but by noting a few 
which come in our way. A man of specialized experience is one 
who has seen and analyzed mentally many cognate phenomena. It 
needs no proof that the more samples taken, the more accurate 
will be the judgment formed about the group of which they are 
samples. Very many business transactions are decided by such an 
examination. Now we have seen that the precision of the average 
shown by samples of quantities which satisfy the normal law of 
error is proportional to the square root of their number; but 
there are three further questions to consider: (α) whether this 
rule applies to samples of quantities which do not conform to the 
law of error, that is, which would not be obtained from a normal 
distribution without great improbability; (β) how we are to 
measure the precision of either the original group of which we 
have samples or of our samples; (γ) whether we can learn 
anything more about the original group besides its average. 

α. On referring back to page 303, it will be seen that the 
averages of samples of, say, m quantities drawn at random from 
a large group whose distribution is not normal, will, if m is 
large in relation to the fluctuation of the original group, satisfy 
the law of error. The reason, apart from the mathematical 
analysis of this, is clearer from the following illustration: if 
we have records of a quantity, which fluctuates in accordance 
with the normal law about an average which changes slowly 
year by year, our measurements will not conform to the normal 
law; but if we select four years at random again and again, 
we shall eliminate the influence of time, and our samples will 
tend to conform. Readers may experiment on the annual 
birth-rates to illustrate this. 

The following numbers are the death-rates per 10,000 in 
London registration districts, arranged in order of magnitude: 



70, 70, 80, 92 
100, 107, 108, 108, 109 
113, 115, 115, 115, 116, 117, 118, 118 
120, 121, 121, 123, 123, 124, 125, 126, 126, 127, 128 
130, 130, 131, 132, 132, 132, 133, 135, 136, 137, 138, 138, 139, 139 
141, 141, 141, 142, 144, 144, 144, 144, 145, 145, 147, 148, 149 
150, 150, 150, 151, 151, 152, 152, 153, 154, 155, 155, 156, 156, 158 
160, 163, 164, 166, 167, 167, 168 
170, 177, 178 
181, 183, 183, 185, 188 
191, 194, 198, 198 
204, 205, 210, 211, 220, 222, 222, 223, 228 
230, 236, 237, 238 
252, 252, 255, 264, 264, 266, 276, 284, 286 
323, 329, 329, 404, 448, 475, 505, 622, 625 
1,408 

These numbers clearly do not conform to the normal curve. 
We will omit 1,408 as being so far from the others as to be in 
a class by itself, and select at random samples of 4, 18 times. 
Their averages are 174, 222, 226½, 221, 129, 150, 181½, 193, 300, 
133, 216, 178, 167, 169½, 183, 150, 227, 164. Average, 188; 
modulus, 57.4. These fit a curve of error closely, thus: 







                                        Calculated from 
                         Observed.      Table on p. 281. 
Within   5 of average        2               1.7 
   "     6½    "             3               2.3 
   "    10     "             4               3.5 
   "    14     "             5               5.5 
   "    18½    "             6               6.3 
   "    21     "             7               7.1 
   "    24     "             8               8.0 
   "    28     "             9               9.2 
   "    33     "            10              10.5 
   "    34     "            11              10.7 
   "    38     "            12              11.3 
   "    38½    "            14              11.8 
   "    39     "            15              11.9 
   "    55     "            16              14.8 
   "    59     "            17              16.7 
   "   112     "            18              18.0 



Thus the theorem is confirmed in a very unpromising case. 
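The comparison can be recomputed from the eighteen averages. The sketch below substitutes the error function for the printed table on p. 281, on the assumption that the number expected within a distance x of the average is $n\,\mathrm{erf}(x/c)$ for a normal curve of modulus c; the function names are illustrative.

```python
import math

# The eighteen averages of random samples of 4, as given in the text.
averages = [174, 222, 226.5, 221, 129, 150, 181.5, 193, 300,
            133, 216, 178, 167, 169.5, 183, 150, 227, 164]

n = len(averages)
mean = sum(averages) / n
c = math.sqrt(2.0 * sum((a - mean) ** 2 for a in averages) / n)  # modulus

def observed_within(limit, centre=188.0):
    """Number of averages not more than `limit` from the rounded average 188."""
    return sum(1 for a in averages if abs(a - centre) <= limit)

def calculated_within(limit):
    """Number expected within `limit` on a normal curve of modulus c (assumed n * erf)."""
    return n * math.erf(limit / c)

for limit in (5, 6.5, 10, 14, 18.5, 21, 24, 28, 33, 34, 38, 38.5, 39, 55, 59, 112):
    print(limit, observed_within(limit), round(calculated_within(limit), 1))
```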

β. To determine the precision of the average of our samples, 
two methods are open. The first consists in finding the modulus 
$c = \sqrt{\frac{2\Sigma\delta^2}{n}}$ of all the quantities chosen; then if the quantities 
conform to a normal curve the modulus of their average is 
$\frac{c}{\sqrt{n}} = \sqrt{\frac{2\Sigma\delta^2}{n^2}}$, and the precision is $\frac{\sqrt{n}}{c}$; if the quantities do 
not conform this formula still gives the best measure of the 
precision, but it may be well to confirm it by the second method. This 
method is to break up the n samples into $\frac{n}{m}$ smaller groups each 
of m, and see if the averages of these groups are such as would 
come from a normal distribution; if they do not, increase m; if 
they show signs of normal grouping in a curve of modulus $c'$, before 
we have come to the limiting value of m, then we may expect that 
the larger sample of n things belongs to a normal curve, whose 
modulus is $\frac{c'}{\sqrt{n/m}}$, which may be expected to be equal to $\frac{c}{\sqrt{n}}$. 
If we do not get conformity with the largest value of m we 
can take, we have no guarantee that n is large enough to 
eliminate the abnormality of the original figures. 
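The second method lends itself to a short sketch: break the samples into groups of m, find the modulus of the group averages, and multiply by $\sqrt{m}$ to recover an estimate of the modulus of the original quantities. The data below are simulated (normal, modulus 5, arbitrary seed) purely to show the procedure; with real records the agreement between the several estimates is the thing to be examined.

```python
import math
import random

def modulus(values):
    """sqrt(2 * sum(d^2) / n), deviations taken from the average."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(2.0 * sum((v - mean) ** 2 for v in values) / n)

def implied_modulus(values, m):
    """Modulus of the original quantities implied by averages of groups of m."""
    groups = [values[i:i + m] for i in range(0, len(values) - m + 1, m)]
    averages = [sum(g) / m for g in groups]
    return modulus(averages) * math.sqrt(m)

# Simulated samples from a normal curve of modulus 5 (hypothetical data).
rng = random.Random(1)
values = [rng.gauss(0.0, 5.0 / math.sqrt(2.0)) for _ in range(2000)]

direct = modulus(values)
from_groups = {m: implied_modulus(values, m) for m in (4, 5, 8, 10)}
print(direct, from_groups)
```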

The following statistics of wages give a practical application 
of this principle. 

In the period 1834-45 inquiries were made in the Scotch 
villages as to the day wages of agricultural labourers. 

The resulting figures for the Lowlands may be tabulated 
as follows: 

Wage       13d.  13½d.  14d.  15d.  16d.  16½d.  17d.  17½d.  18d.  18½d. 
Numbers      5     3      2     8    12     6     24     3     39     3 

Wage       19d.  20d.   21d.  22d.  23d.  23½d.  24d.  24½d.  25d.  27d. 
Numbers     27    26     27    15     1     1      4     1      2     2 

Average, 18.8d.; modulus, 3.62d. = c. 

Correspondence with Law of Error. 

                                  Observed 
Limits.         Normal.    Above       Below       Total. 
                           Average.    Average. 
18.8 ± ⅕c          46         27    +     3     =     30 
       ⅖c          90         53    +    45     =     98 
       ⅗c         127         53    +    69     =    122 
       ⅘c         156         80    +    87     =    167 
        c         178         95    +    87     =    182 
      1⅕c         192         96    +    95     =    191 
      1⅖c         201         97    +    97     =    194 
      1⅗c         206        102    +   100     =    202 
       2c         210        104    +   105     =    209 

When we divide the returns into 50 samples of 4 we get 
modulus for their averages 1.8; 25 samples of 8 give modulus 
1.14; 40 samples of 5 give modulus 1.57; 20 samples of 10 
give modulus 1.19. 

The c for the original samples may be found from any of 
these; the results are: 

Modulus of original samples                               3.62 
   "    calculated from the groups of  4,  1.8 × √4    =  3.6 
   "         "        "       "        8,  1.14 × √8   =  3.2 
   "         "        "       "        5,  1.57 × √5   =  3.5 
   "         "        "       "       10,  1.19 × √10  =  3.8 

This is a close consilience with theory. We will adopt 3.6 
as the value of c; then the modulus of the average of the 211 
original samples is $\frac{3.6}{\sqrt{211}}$, its precision $\frac{\sqrt{211}}{3.6}$, and its probable 
error .47 of $\frac{3.6}{\sqrt{211}} = .12 \ldots$, or ⅛ of a penny. 
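The arithmetic of this comparison may be set out as follows. The group moduli are those quoted in the text; the factor 0.4769 for the probable error is an assumed constant (the text rounds it to .47), and the variable names are illustrative.

```python
import math

PROBABLE_ERROR_FACTOR = 0.4769  # assumed constant; the text uses .47

# Modulus of the averages of groups of m wage returns, as quoted in the text.
group_moduli = {4: 1.8, 8: 1.14, 5: 1.57, 10: 1.19}

# Each implies a modulus for the original returns: c = (group modulus) x sqrt(m).
implied_c = {m: cm * math.sqrt(m) for m, cm in group_moduli.items()}

c, n = 3.6, 211
modulus_of_average = c / math.sqrt(n)
precision = 1.0 / modulus_of_average
probable_error = PROBABLE_ERROR_FACTOR * modulus_of_average

print({m: round(v, 1) for m, v in implied_c.items()})
print(modulus_of_average, precision, probable_error)
```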

We should verify that the samples conform to the law of 
error: the following shows the comparison for the samples 
of 4: 

                                             Observed 
Limits.                       Normal.   Above       Below       Total. 
                                        Average.    Average. 
18.8 ± ⅕ of modulus (1.8)        11        6     +     7     =    13 
       ⅖       "                 21        7     +    11     =    18 
       ⅗       "                 30       14     +    17     =    31 
       ⅘       "                 37       14     +    23     =    37 
       modulus                   42       18     +    24     =    42 
      1⅕ of modulus              45       20     +    26     =    46 
      1⅖       "                 48       23     +    26     =    49 
      1⅗       "                 49       23     +    26     =    49 
      1⅘       "                 49       23     +    27     =    50 
       2 modulus                 50       23     +    27     =    50 



This resemblance is as close as the argument requires. 

γ. If our first samples conform to the law of error we know with 
reasonable certainty the average and the distribution of the original 
quantities, namely, that they conform to a normal curve with 
approximately the same average and modulus as our samples. 
The general average and the sample average differ in accordance 
with a law of error, modulus $\frac{c}{\sqrt{n}}$, where c is modulus for samples 
and n their number. 

If our first samples do not conform, it is still probable that 
their curve of frequency has a resemblance to that of the original 
quantities. If the fraction $p_1$ of the original quantities lay 
between assigned limits $a_1$ and $a_2$, then the number to be 
expected between those limits in n samples is decided by the 
expansion of $(p_1 + q_1)^n$, where $p_1 + q_1 = 1$; the most probable 
number is the integer nearest $p_1 n$, and the modulus is $\sqrt{2p_1q_1n}$; 
similarly if $p_2$, $q_2$ bear a similar relation to $a_2$, $a_3$, the most 
probable number selected between these limits is $p_2 n$, and 
modulus $\sqrt{2p_2q_2n}$, and so on. Thus a similar distribution may 
be expected, and each part of it has a precision varying jointly 
as the square root of the whole number taken and the quantity 
$\sqrt{p(1-p)}$; thus the larger the number taken the greater will 
be the resemblance, and [since $\sqrt{p_1(1-p_1)} > \sqrt{p_2(1-p_2)}$ when 
$p_1 > p_2$ and $p_1 + p_2 < 1$] the larger the altitude of the area in the 
curve of frequency corresponding to given limits the greater its 
precision. The errors of the various divisions are not, however, 
entirely independent of one another. This is, of course, in 
strict accordance with the common sense of the question. 
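The rule for a single division of the curve of frequency may be sketched as below. The function name and the figures (one-tenth of the original quantities between the limits, 200 samples) are hypothetical and serve only to show the calculation.

```python
import math

def expected_between_limits(p, n):
    """Most probable number between the limits, and its modulus sqrt(2pqn).

    p is the fraction of the original quantities lying between the limits,
    n the number of samples drawn at random.
    """
    most_probable = round(p * n)
    modulus = math.sqrt(2.0 * p * (1.0 - p) * n)
    return most_probable, modulus

# Hypothetical figures: one-tenth of the group lies between the limits.
most_probable, modulus = expected_between_limits(0.1, 200)
print(most_probable, modulus)
```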

The following examples of school ages illustrate part of 
this argument. In a school containing 257 boys of varying 
ages, where the dispersion was not likely to be normal, 48 were 
selected at random and their ages written down. 

The modulus of the 48 samples is 43.2 months; their average 13 
years 10 months; their distribution as follows: 

                                 Observed 
Limits.          Normal.   Above       Below       Total. 
                           Average.    Average. 
 ⅕ modulus          11        3     +     6     =     9 
 ⅖    "             21       10     +    10     =    20 
 ⅗    "             29       12     +    16     =    28 
 ⅘    "             36       13     +    21     =    34 
 modulus            40       16     +    22     =    38 
1⅕ modulus          44       19     +    24     =    43 
1⅖    "             46       20     +    26     =    46 
1⅗    "             47       22     +    26     =    48 

The observations are not grouped symmetrically nor in close 
agreement with the normal distribution. 

When we take the average of random samples we do not 
find close relation to the normal curve till the number of 
samples becomes too small to work with. Hence we have no choice but to assume that 48 is a large enough number of items to neutralize the want of symmetry in the figures. The average of the whole group is as likely as not to be within the limits 13 years 10 months $\pm \frac{43}{\sqrt{48}} \times .47$ months, that is, between 13 years 7 months and 14 years 1 month.
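The limits just stated can be reproduced as follows. This is a minimal sketch, assuming the usual relation for a normal curve of error that the probable error is 0.4769 of the modulus (the text rounds the factor to .47); the names are hypothetical.

```python
import math

# For a normal curve of error, erf(0.4769) = 1/2, so the probable error
# is 0.4769 times the modulus (assumption stated in the lead-in).
PROBABLE_ERROR_FACTOR = 0.4769

def probable_error_of_average(modulus, n_items):
    """Probable error of the average of n_items observations."""
    return PROBABLE_ERROR_FACTOR * modulus / math.sqrt(n_items)

pe_months = probable_error_of_average(43.2, 48)
print(round(pe_months, 1))  # about 3 months on either side of 13 years 10 months
```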

Again the quartiles in our samples are at 18 months above and 2 years below the average; the quartiles in the original group may be expected to be within the same distances with probable errors $\sqrt{2 \times \tfrac{1}{4} \times \tfrac{3}{4} \text{ of } 48} = 4.2$ months, since the chance that any quantity shall be between the average and the lower quartile is $\tfrac{1}{4}$.

From a census of the whole school it was found that all 
these conditions were fulfilled ; the average was 14 years ; the 
quartiles were unfortunately not kept ; but 58 boys out of the 
257 were stated to be over 15 years 9 months, from which it 
is highly probable that the upper quartile was within the given 
limits, 15 years 6 months ± 4 months; and 54 were below 11 
years 10 months, which places the lower quartile also well 
within the limits. 

[Side-note: Precision of a difference.]

The principle of corollary 4, the modulus of a difference, is most useful in comparing two groups selected as having certain qualities. Thus Professor Edgeworth * discusses whether an ascertained difference of 2 inches between the average heights of a large number of criminals and that of the general population is significant; and finding that the modulus for the difference between two random groups is only 0.08, holds that there is a cause of the difference in the method of selection; that is, that criminality and low stature are found together. We might apply the same principle to the investigation of the existence of a period in any figures; for if the modulus of the figures was $c$, the modulus for the difference between the averages of two random samples of 20 months each would be $c\sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = \tfrac{c}{\sqrt{10}}$; if the difference between the averages of the figures for 20 Decembers and 20 Junes was 3 times this quantity the existence of a period would be established.
For instance, in the percentage of ironfounders unemployed monthly from 1855 to 1874 † the modulus for single months is about 30, and for the difference between the averages of two groups, one of 20 and the other of 240, is therefore $30\sqrt{\tfrac{1}{20} + \tfrac{1}{240}} = 7$; but the average of the 20 Decembers is about 29 % above the general average, a significant difference; and the average of the 20 Augusts is about 19 % below, a divergence smaller than before, but still significant; the difference between the Decembers and Augusts, namely, 48, is to be compared with the modulus $30 \times \sqrt{\tfrac{1}{20} + \tfrac{1}{20}} = 9$, and is therefore significant.

* Statistical Journal, Jubilee Number, ibid.

† See p. 179, supra.
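The two moduli used for the ironfounders' figures follow from the rule for the modulus of a difference between two averages. A small sketch, with a hypothetical function name, assuming the two groups are independent random samples from figures having the same modulus:

```python
import math

def modulus_of_difference(modulus_single, n1, n2):
    """Modulus of the difference between the averages of two independent
    random groups of n1 and n2 items, each item having modulus_single."""
    return modulus_single * math.sqrt(1.0 / n1 + 1.0 / n2)

# 20 Decembers against the whole 240 months, and 20 Decembers against 20 Augusts.
december_vs_all = modulus_of_difference(30, 20, 240)
december_vs_august = modulus_of_difference(30, 20, 20)
print(round(december_vs_all, 2), round(december_vs_august, 2))
```

The observed differences of 29 and 48 are several times these moduli, which is the ground on which they are called significant.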

[Side-note: General example.]

A final example may be given which brings into relation many of these theorems. The following were the recorded times for "The Oaks" from 1850 to 1899; we will discuss whether there has been a significant increase of speed, or some change in the conditions of the race, or whether the fluctuations are due to minor causes varying year by year.

        min. sec.           min. sec.           min. sec.           min. sec.           min. sec.
1850     2   56      1860    2   56      1870    2   52      1880    2   49      1890    2   40[?]
1851     2   52      1861    2   44      1871    2   51      1881    2   46      1891    2   54[?]
1852     3    0      1862    2   49      1872    2   52      1882    2   49      1892    2   43[?]
1853     2   52      1863    2   54      1873    2   50[?]   1883    2   53[?]   1893    2   44[?]
1854     3    0      1864    2   47      1874    2   48[?]   1884    2   49      1894    2   50
1855     2   58      1865    2   51      1875    2   49[?]   1885    2   43[?]   1895    2   48[?]
1856     3    4      1866    2   53      1876    2   50      1886    2   54[?]   1896    2   45[?]
1857     2   50      1867    2   54      1877    2   54[?]   1887    2   50[?]   1897    2   45
1858     2   53[?]   1868    2   47[?]   1878    2   54      1888    2   42[?]   1898    2   45[?]
1859     2   55      1869    2   59      1879    3    2      1889    2   45      1899    2   44

Ten-yearly
averages 2   56              2   51[?]           2   52[?]           2   48              2   47

These figures fit fairly closely a normal curve of error with modulus 7.43 secs., average 2 min. 50.87 secs. The modulus for the difference between two is therefore $7.43\sqrt{1+1} = 10.48$ secs. The greatest difference between consecutive years is 14 secs., between 1856 and 1857; this is not sufficiently far beyond the modulus to make it uncommon; hence there is no proof of any sudden change in arrangements having taken place between two races. The difference between the times for years early in the period and those later is sometimes as much as 20 secs. The modulus for the difference between the averages for two periods of 10 years is $7.43\sqrt{\tfrac{1}{10} + \tfrac{1}{10}} = 3.3$. The difference between the averages for 1850-59 and 1890-99 is 9 secs., which is significant; that between 1850-59 and 1880-89 is also significant. The odds against such a difference as that between the average times of 1850-59 and 1860-69 are only 13 to 1, not very significant. Hence we find that some cause was at work which gradually quickened the race between the fifties and the eighties.
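The significance of a difference between two decade averages can be put into figures as sketched below. The sketch assumes that the chance of a random difference numerically exceeding t times its modulus is erfc(t), and takes the difference of 8 secs. between the ten-yearly averages 2 min. 56 secs. and 2 min. 48 secs.; the function name is illustrative.

```python
import math

def chance_of_difference(difference, modulus):
    """Chance that a purely random difference is numerically at least as
    great as `difference`, for a normal curve of error with this modulus."""
    return math.erfc(abs(difference) / modulus)

# Modulus for the difference between the averages of two periods of 10 years.
modulus_decades = 7.43 * math.sqrt(1 / 10 + 1 / 10)

# 1850-59 against 1880-89: averages 2 min. 56 secs. and 2 min. 48 secs.
chance_50s_vs_80s = chance_of_difference(8.0, modulus_decades)
print(round(modulus_decades, 2), chance_50s_vs_80s)
```

The result is of the same order as the chance of .0006 quoted in the note to this passage; the small disagreement comes from rounding the modulus to 3.3.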

[Side-note: Application to series of different classes.]

This method can be applied to the criticism of such serial figures as birth, death, and marriage rates, imports, exports, and so on. With a periodic series the method can be used first for establishing the period, and then for investigation of the figures found when the periodicity is eliminated. With a symptomatic * curve, the method can be used for measuring the symptomatic tendency, and then for studying the short-period fluctuations. For a series which has no symptom and no period, the method is at once applicable for finding what divergencies are significant, and for forecasting and interpolating numbers. Without some machinery of calculation of this kind we are unable to get beyond vague and general impressions of the existence of a change; † but if we take care that the conditions of the calculation are satisfied, we can by the method now developed make a definite statement quite independent of personal bias, such as "either an event has happened, so improbable as to be outside the range of human experience, or the decrease shown in the series of figures in question is due to some significant change in the system of causes which produce them."

* See p. 240, supra.

† We can take an intermediate step by noticing in the above table that in nine cases out of ten the times in the decade 1880-9 are less than the times thirty years earlier; the chance that so great an agreement in the direction of the change (irrespective of its magnitude) should come in a random selection is 11/512 or .0215; the chance as calculated above is .0006. See Edgeworth in Jubilee Volume of Statistical Journal, pp. 213-217.

Section VI. The Theory of Correlation.

[Side-note: Causal connection.]

It is never easy to establish the existence of a causal connection between two phenomena or series of phenomena; but a great deal of light can often be thrown by the application of algebraic probability. We have already dealt with some cases in point; we have shown how to find whether an event is due to a special cause, or whether it naturally arises from the variation of existing causes; we have shown how to measure the significance of the difference between two quantities or two averages; and further, we have investigated such problems as the influence of the seasons.* In many large groups of phenomena we can apply a more refined and more certain method, which it is our object to introduce in this section. When two quantities are so related that the fluctuations in one are in sympathy with the fluctuations of the other, so that an increase or decrease of one is found in connection with an increase or decrease (or inversely) of the other, and the greater the magnitude of the changes in the one, the greater the magnitude of the changes in the other, the quantities are said to be correlated. Correlation is a quantity which can be measured numerically; and its measurement has been the subject of much recent mathematical investigation.

[Side-note: The correlation surface.]

Let two variable quantities X, Y be subject to variations x, y, which are due to a multitude of individually unimportant causes, producing fluctuations $e_1, e_2, \ldots$ and $\epsilon_1, \epsilon_2, \ldots$, so that the x's are connected with the e's and the y's with the $\epsilon$'s by the equations

$x = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n$
$y = b_1 \epsilon_1 + b_2 \epsilon_2 + \cdots + b_n \epsilon_n$,

where $a_1, a_2, \ldots, b_1, b_2, \ldots$ are constants.

Then x and y conform to normal curves of error, whose moduli we will call $c_1$, $c_2$.

The rest of our investigation, which is based closely on Professor Karl Pearson's paper on "Regression, Heredity, and Panmixia," † proceeds on the assumption that the e's and $\epsilon$'s conform to normal curves of error, which is not, however, the most general condition.

* See p. 186, supra.

† Transactions of the Royal Society, vol. 187 (1896), A. 253-318.

Let individual values of X and Y, x and y respectively, be grouped in pairs, as measurements of two quantities at the same date, or of two parts of the same organism, or in any other way. If x and y are quite independent, none of the causes producing them are common to both, and the e's are independent of the $\epsilon$'s in the above equations. Then z, the chance of divergencies x and y concurring, is given by

$z = e^{-\frac{x^2}{c_1^2}} \cdot e^{-\frac{y^2}{c_2^2}} \times$ (a constant).

For any one value of x, the quantities y are grouped about the mean value Y, in accordance with the normal curve $c_2$; and similarly for any one value of y.

The above equation may be written $z = C\, e^{-\left(\frac{x^2}{c_1^2} + \frac{y^2}{c_2^2}\right)}$. If we give z any definite value k, the x's and y's which have jointly the probability k are connected by the equation

$\frac{x^2}{c_1^2} + \frac{y^2}{c_2^2} = \log_e \frac{C}{k}$,

which is the equation of an ellipse having its principal axes coincident with the axes of measurement of x and y, if we suppose x and y measured on two horizontal lines perpendicular to one another. Let z be measured vertically; then in the surface given by the equation connecting z, x, y all the horizontal sections are similar ellipses, whose projections on a horizontal plane are concentric and similarly situated,* while all the vertical sections are normal curves of error.†

This is the surface of no correlation.

* For a diagram of this projection and for a general discussion of correlation on the same lines as this chapter, but more advanced and complete, see Mr Udny Yule's paper on The Theory of Correlation in the Statistical Journal, Dec. 1897.

† For the section by a plane $y = mx + n$ is $z = C\, e^{-\left(\frac{x^2}{c_1^2} + \frac{(mx+n)^2}{c_2^2}\right)}$, which may be written $z = D\, e^{-A(x+B)^2}$, where A, B, and D are constants.

If, on the other hand, any of the e's coincide with any of the $\epsilon$'s, it may be shown that a new term is introduced in the equation between z, x, and y, which becomes

$z = \frac{n}{\pi c_1 c_2 \sqrt{1-r^2}}\; e^{-\left(\frac{x^2}{c_1^2} - \frac{2rxy}{c_1 c_2} + \frac{y^2}{c_2^2}\right)\frac{1}{1-r^2}}$,

where n is the number of pairs of observations and r is a quantity we have still to decide; this is the general equation of the normal correlation surface. The horizontal sections, obtained by giving z different constant values, are now of the form

$\frac{x^2}{c_1^2} - 2r\,\frac{xy}{c_1 c_2} + \frac{y^2}{c_2^2} = l$, a quantity independent of x and y.

The projections of the horizontal sections are still concentric, similar and similarly situated ellipses, but their principal axes are now inclined to the axes of x and y. The vertical sections are still normal curves of error with various centres; in particular the frequencies of the values of y found in conjunction with $x_1$, a particular value of x, are given by the equation

$z = e^{-\left(\frac{x_1^2}{c_1^2} - \frac{2r x_1 y}{c_1 c_2} + \frac{y^2}{c_2^2}\right)\frac{1}{1-r^2}} \times$ constant
$\;\; = e^{-\frac{1}{c_2^2(1-r^2)}\left(y - r\frac{c_2}{c_1}x_1\right)^2} \times$ constant.

This is a normal curve of error with its centre at $r\frac{c_2}{c_1}x_1$. Thus the mean value of y corresponding to $x_1$, a given value of x, is $r\frac{c_2}{c_1}x_1$.

These mean values all lie on the line $\frac{y}{c_2} = r\,\frac{x}{c_1}$.

Similarly the mean values of x, corresponding to given values of y, lie on the line $\frac{x}{c_1} = r\,\frac{y}{c_2}$.

[Side-note: The coefficient of correlation.]

r is called the coefficient of correlation. If r is positive, for every given value of x, the mean value of the corresponding y's is a definite positive fraction of x; if r is negative, the correlation is said to be negative, and for every given value of x, the mean value of the corresponding y's is a definite negative fraction of x.

To determine the value of r, we must observe that this single quantity determines the shape of the whole surface, when n, $c_1$, $c_2$ are given, just as the modulus determines the shape of the curve of error. We decided the best value of the modulus * by considering from what curve of error the observed values would arise with least improbability. Professor Pearson finds the value of r by considering from what distribution of z, x, and y (i.e., from what surface of correlation) the observed pairs would arise with least improbability; r is thus found to be $\frac{\Sigma xy}{n\sigma_1\sigma_2}$ or $\frac{1}{n}\Sigma\left(\frac{x}{\sigma_1}\cdot\frac{y}{\sigma_2}\right)$, the summation being extended over all pairs of x, y, where $\sigma_1$, $\sigma_2$ are the errors of mean square of the x's and y's respectively, and hence $\sigma_1 = \frac{c_1}{\sqrt{2}}$, $\sigma_2 = \frac{c_2}{\sqrt{2}}$.

* See p. 283, supra.

But with other values of r the observed pairs might have been obtained with greater or less improbability, and these values are distributed in accordance with a normal curve of error whose probable error is $.67\,\frac{1-r^2}{\sqrt{n}}$; * that is, when from all the possible correlation surfaces, which might have resulted in the observed values, those whose correlation coefficients are within the limits $r \pm .67\,\frac{1-r^2}{\sqrt{n}}$ are selected, the sum of their probabilities is $\tfrac{1}{2}$.

It will be useful to examine the limits of the possible values of r.

r always lies between +1 and -1.

For $n^2\sigma_1^2\sigma_2^2 - (\Sigma xy)^2 = \Sigma x^2 \cdot \Sigma y^2 - (\Sigma xy)^2$, since $\sigma_1^2 = \frac{\Sigma x^2}{n}$, $\sigma_2^2 = \frac{\Sigma y^2}{n}$,

$= (x_1^2 + x_2^2 + \cdots)(y_1^2 + y_2^2 + \cdots) - (x_1 y_1 + x_2 y_2 + \cdots)^2$

[where $(x_1, y_1), (x_2, y_2), \ldots$ are pairs of observations, and $\frac{y_1}{x_1} = \lambda_1$, $\frac{y_2}{x_2} = \lambda_2, \ldots$]

$= x_1^2 x_2^2 (\lambda_1 - \lambda_2)^2 + \cdots$,

which is zero if $\lambda_1 = \lambda_2 = \lambda_3 = \cdots$, but otherwise positive.

Hence $n^2\sigma_1^2\sigma_2^2 - (\Sigma xy)^2$ is positive,

$1 > \frac{(\Sigma xy)^2}{n^2\sigma_1^2\sigma_2^2}$,

$1 > r^2$,

and r is between +1 and -1, except when $\lambda_1 = \lambda_2 = \lambda_3 = \cdots$; in this case r = ±1, and the correlation is said to be perfect, positively or negatively.

* See Pearson, loc. cit., p. 226; Yule, loc. cit., p. 847; and Pearson, Proceedings of Royal Society, Oct. 1897. By a similar line of reasoning the probable error of c as determined on p. 283 is found to be $.477\,\frac{c}{\sqrt{n}}$.

It may be noticed that on à priori grounds without any mathematical investigation the formula $\frac{1}{n}\left(\frac{x_1}{\sigma_1}\cdot\frac{y_1}{\sigma_2} + \frac{x_2}{\sigma_1}\cdot\frac{y_2}{\sigma_2} + \cdots\right)$ gives a good measure of correlation. For if there is positive correlation, whenever we have a positive value of x we may expect a positive value of y, and whenever we have a negative value of x we may expect a negative value of y, and each such term increases the coefficient; while, if there is no correlation, for any value of x occurring several times, we may expect positive and negative values of y which on the whole give a very small sum. Meanwhile the denominators at once bring the deviations into relation with the mean deviations, and prevent the whole coefficient becoming greater than unity.

[Side-note: The measurement of correlation.]

We see then that r measures the correspondence between deviations from their means of the two series of observations. If the deviations are in exactly the same ratio for all pairs, the correlation is perfect, and r = 1; while r tends to zero when for a given deviation in one of the series we have excess and defect with equal frequency in the other.

r serves as a measure of any statement involving two qualifying adjectives, which can be measured numerically, such as "tall men have tall sons," "wet springs bring dry summers," "short hours go with high wages."

When r is not greater than its probable error $.67\,\frac{1-r^2}{\sqrt{n}}$ we have no evidence that there is any correlation, for the observed phenomena might easily arise from totally unconnected causes; but when r is greater than, say, 6 times its probable error, we may be practically certain that the phenomena are not independent of each other, for the chance that the observed results would be obtained from unconnected causes is practically zero.
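The rule just given may be sketched in a few lines. The function names and the wording of the three verdicts are hypothetical, and the use of the numerical value of r (so that negative coefficients are treated alike) is an assumption not stated in the text.

```python
import math

def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient r found from n pairs,
    by the formula of the text: .67 (1 - r^2) / sqrt(n)."""
    return 0.67 * (1.0 - r * r) / math.sqrt(n)

def verdict(r, n, factor=6.0):
    """Rough reading of r against its probable error.
    Assumption: the numerical value |r| is compared, whatever its sign."""
    pe = probable_error_of_r(r, n)
    if abs(r) <= pe:
        return "no evidence of correlation"
    if abs(r) > factor * pe:
        return "practically certain"
    return "doubtful"

print(round(probable_error_of_r(0.25, 20), 2), verdict(0.25, 20))
```

Coefficients lying between once and six times their probable error are left as "doubtful", since the text lays down no rule for that interval.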

The calculation of r is quite simple, and if we can assume normal dispersion, so that the probable error in a series is equal to .67 of the error of mean square,* can be performed very rapidly. In the following tables the correlation between the prices of wheat, foreign trade, and the marriage-rate, already discussed by the help of the graphic method, is investigated.

* Hence $\sigma$ = about $\tfrac{3}{4}$ of distance between quartiles.

Examples of Correlation.







Marriage-Rate and Price of Wheat.

1845-64.

         Marriage-                   Price of                     Products of     Imports and
Years.   rate.      Differences.     Wheat.      Differences.     Differences.    Exports.
                                     s.  d.                                       £ mln.

1845     17.2       +  .44           50  10      +  20            +   8           143
1846     17.2       +  .44           54   8      +  26            +  11           131
1847     15.8       -  .96           69   9      + 207            - 199           142
1848     15.9       -  .86           50   6      -  24            +  21           142
1849     16.2       -  .56           44   3      -  99            +  56           162
1850     17.2       +  .44           40   3      - 147            -  65           172
1851     17.2       +  .44           38   6      - 168            -  74           185
1852     17.4       +  .64           40   9      - 145            -  92           187
1853     17.9       + 1.14           53   3      +   9            +  10           222
1854     17.2       +  .44           72   5      + 239            + 105           249
1855     16.2       -  .56           74   8      + 262            - 147           239
1856     16.7       -  .06           69   2      + 200            -  12           288
1857     16.5       -  .26           56   4      +  46            -  12           310
1858     16.0       -  .76           44   2      - 114            +  87           281
1859     17.0       +  .24           43   9      - 119            -  29           309
1860     17.1       +  .34           53   3      +   9            +   3           346
1861     16.3       -  .46           55   4      +  34            -  16           342
1862     16.1       -  .66           55   5      +  35            -  23           340
1863     16.8       +  .04           44   9      -  93            -   4           396
1864     17.2       +  .44           40   2      - 148            -  63           435

Av.      16.76                  Av.  52   6                $\Sigma xy$ = - 445

Correlation between marriage-rate and the price of wheat:

$\sigma_1 = .580$, $\sigma_2 = 133$

$r = \frac{-445}{20 \times 133 \times .58} = -.29$

Probable error of r = .14

Correlation between marriage-rate and imports and exports:

$\sigma_1 = .580$, $\sigma_2 = 90$
$\Sigma xy = 8$
r = + .007

Probable error of r = .15

1875-94.




















1875     16.7       + 1.53           45   2      +  88            + 135           597
1876     16.5       + 1.33           46   2      + 100            + 133           576
1877     15.7       +  .53           56   9      + 227            + 120           593
1878     15.2       +  .03           46   5      + 103            +   3           562
1879     14.4       -  .77           43  10      +  72            -  55           554
1880     14.9       -  .27           44   4      +  78            -  21           634
1881     15.1       -  .07           45   4      +  90            -   6           631
1882     15.5       +  .33           45   1      +  87            +  29           654
1883     15.5       +  .33           41   7      +  45            +  15           667
1884     15.1       -  .17           35   8      -  26            +   4           623
1885     14.5       -  .67           32  10      -  60            +  40           584
1886     14.2       -  .97           31   0      -  82            +  80           563
1887     14.4       -  .77           32   6      -  64            +  49           584
1888     14.4       -  .77           31  10      -  72            +  55           622
1889     15.0       -  .17           29   9      -  97            +  16           676
1890     15.5       +  .33           31  11      -  72            -  23           684
1891     15.6       +  .43           37   0      -  10            -   4           683
1892     15.4       +  .23           30   3      -  91            -  21           651
1893     14.7       -  .47           26   4      - 138            +  65           623
1894     15.1       -  .07           22  10      - 180            +  13           624

Av.      15.17                  Av.  37  10                $\Sigma xy$ = 627

Correlation between marriage-rate and the price of wheat:

$\sigma_1 = .651$, $\sigma_2 = 102$
$\Sigma xy = 627$
[Or distance between quartiles = .9, whence $\sigma_1 = .67$]

$r = \frac{627}{20 \times 102 \times .651} = +.47$

Probable error of r = .12

Correlation between marriage-rate and imports and exports:

$\sigma_1 = .651$, $\sigma_2 = 41$
r = + .25

Probable error of r = .14



Hence there was slight negative correlation between the marriage-rate and price of wheat before 1864, that is, the marriage-rate fell when wheat rose; but since 1864 there is better evidence that the marriage-rate rises when wheat rises. The marriage-rate and foreign trade were quite uncorrelated before 1864, and show only slight correlation at more recent dates; the odds against the correspondence between the observed figures, since 1875, arising without causal connection are only about 4 to 1, if we assume that the figures for each year are independent of the next.
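The working for the years 1875-94 can be retraced from the two columns of differences. The sketch below transcribes those differences as they are read from the table (marriage-rate in units of the rate, wheat in pence); the function names are hypothetical, and small disagreements with the printed figures are to be expected from rounding.

```python
import math

# Differences from the period averages, as read from the 1875-94 table.
rate_dev = [1.53, 1.33, 0.53, 0.03, -0.77, -0.27, -0.07, 0.33, 0.33, -0.17,
            -0.67, -0.97, -0.77, -0.77, -0.17, 0.33, 0.43, 0.23, -0.47, -0.07]
wheat_dev = [88, 100, 227, 103, 72, 78, 90, 87, 45, -26,
             -60, -82, -64, -72, -97, -72, -10, -91, -138, -180]

def error_of_mean_square(devs):
    """Square root of the mean of the squared differences (sigma)."""
    return math.sqrt(sum(d * d for d in devs) / len(devs))

def correlation(xs, ys):
    """r = sum(xy) / (n * sigma_1 * sigma_2), the formula of the text."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xy / (n * error_of_mean_square(xs) * error_of_mean_square(ys))

sigma_rate = error_of_mean_square(rate_dev)
sigma_wheat = error_of_mean_square(wheat_dev)
r_wheat = correlation(rate_dev, wheat_dev)
print(round(sigma_rate, 3), round(sigma_wheat, 1), round(r_wheat, 3))
```

The value of $\sigma_1$ agrees with the .651 of the table, and r comes out close to the printed + .47.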

[Side-note: The Galtonian method.]

An earlier method of estimating correlation, introduced by Mr Galton,* is very useful for a rapid survey of two groups of figures. As a simple example adequately illustrating the method, we will take two series where the correlation is likely to be great, namely, the daily records of maxima and minima temperature recorded for 1898.*

* See Proceedings of the Royal Society, 1886, vol. xl., Family Likeness in Stature.

[Table: CORRELATION OF DAILY MAXIMA AND MINIMA OF TEMPERATURE IN 1898.]

We first make a rapid survey of the series, and notice that the minima range from about 23° to 63°, and the maxima from about 35° to 95°. Divide each of these ranges into, say, 10 equal parts, and draw up the framework of a table like the annexed. Turn through the records and enter the maximum and minimum for each day by a dot in the appropriate place; thus on 4th October the maximum was 61.3° and the minimum 51.3°; a dot should be put in the row 60°-64.9° under the heading 50°-54.9°. When all the dots are entered, replace them by their number in each square. The table shows the result for 358 days. If there is correlation, it will be found that the medians, or arithmetic averages, of each row form an orderly progression, and similarly for each column. These medians are roughly estimated and given in the table.

To test the correlation of the minima relative to the maxima a diagram is drawn. Choose scales so that the distance between the quartiles of the maxima (18°) shall be represented by the same length vertically, as represents the distance between the quartiles of the minima (14°) horizontally. Place crosses horizontally level with the middle points of the successive limits of maxima and vertically above the positions on the scale of the medians of the corresponding minima.

[Figure: diagram; horizontal axis, Scale of Minima.]

* Whitaker's Almanack, 1899.

Now draw two lines. The first through the positions of the quartiles and median ($Q_1$, $Q_2$, M); this is the line of perfect correlation, and with the scales we have chosen is at 45° to the horizontal; draw another line through M, passing as near as possible to all the crosses. Draw any horizontal line PCN intersecting the former lines as in the figure. The ratio of CN to PN is the coefficient of correlation. If the line CM passes through all the crosses and coincides with PM, the correlation is perfect. If CM is perpendicular to PM, there is perfect negative correlation. If CM is vertical there is no correlation. In the figure the ratio
CN to PN is 4. A rough test of the presence of correlation is 
to be obtained by noticing whether all the crosses above the 
median are on one side of PM and below the median on the 
other side. 

[Side-note: Relation between ...]

There is a simple connection between the coefficient thus determined and that obtained by the previous formula. On the diagram the scales are so chosen that we replace $\frac{x}{\sigma_1}$, $\frac{y}{\sigma_2}$ by quantities $\xi$, $\eta$ measured by equal units. Then if $(\xi_1, \eta_1), (\xi_2, \eta_2), \ldots$ are the positions of the 358 original pairs the line $y = rx$ can be shown to be that whose mean distance from these points is a minimum when $r = \frac{\Sigma\xi\eta}{n}$, its value previously given. It is easily seen that in the figure the ratio $\frac{CN}{PN}$ is $r_1$ if $y = r_1 x$ is the equation of a line through M referred to horizontal and vertical axes. Hence the line CM might be drawn from the original formula by taking $\frac{CN}{PN} = \frac{\Sigma\xi\eta}{n}$; in other words, we have here a graphic method of finding the coefficient of correlation.

Calculating r roughly from the data $\sigma_1 = 12.7$, $\sigma_2 = 9.1$, $n = 358$, $\Sigma xy = 32130$, $r = \frac{32130}{358 \times 12.7 \times 9.1} = .78$ approx.; that is, we obtain approximately the same value by either method.

Mr Galton applied this method to the question of inheritance of stature. He found that the correlation between the statures of children and of their parents was $\tfrac{2}{3}$. That is, if a group of parents had an average stature x inches above (or below) the general average, the average for their sons was only $\tfrac{2}{3}x$ inches above (or below) the general average. This return towards the average is called in biological language "regression," and hence the coefficient of correlation is often spoken of as the "coefficient of regression," and such an equation as $y = r\,\frac{\sigma_2}{\sigma_1}\,x$ is called the "equation of regression." In words this equation is: the ratio of the divergence of one quantity from its mean value to its standard deviation equals the ratio of the divergence of a correlated quantity to its standard deviation, multiplied by the coefficient of regression.

There is an intimate relation between the law of error and biological theory. The law of error and other cognate laws give algebraic expression to the universal tendency to variation, whether we are dealing with any part of the social organism to whose measurement we have in this book limited statistics, or with any measurable organ of an animal or vegetable. The law of heredity can be only tested numerically by the theory of correlation; the effect of natural selection is easily considered with the help of the coefficient of regression. For if there is no selection, the distance from the general average of the mean stature of successive generations, descended from a group whose mean deviation was x, will be $rx, r^2x, \ldots, r^nx$ if r remains unchanged, a series whose terms rapidly tend to zero. If on the other hand a selection is made in each generation of those above the average, the divergence can be preserved and intensified. The discussion of this point would lead us too far afield.
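The dwindling of the series rx, r²x, ... may be seen in a short sketch. The coefficient of one-half and the initial deviation of 3 inches are hypothetical values chosen only for illustration, and the function name is not taken from the text.

```python
def regression_series(x, r, generations):
    """Mean deviations of successive generations, r*x, r^2*x, ...,
    when there is no selection and r remains unchanged."""
    return [x * r ** k for k in range(1, generations + 1)]

# Hypothetical case: a group 3 inches above the average, with r = 1/2.
print(regression_series(3.0, 0.5, 4))
```

With any coefficient less than unity the terms fall off in geometrical progression, which is the sense in which they "rapidly tend to zero."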

[Side-note: Conclusion.]

In this Second Part we have only discussed the elements of the subject, the theorems and formulae which writers on statistics now assume. We have examined only the normal curve of error, and have not touched the asymmetrical curve of error, or algebraic formulae arising from different hypotheses, or correlation between more than two variables. In the region to which we have confined ourselves, however, we have had to deal with arguments of the same nature as are to be met with in the higher paths of statistics. The great difficulty which the student of economics encounters when dealing with the theory of error is the apparent slightness of relation between this theory and the facts with which he deals. This slightness is only apparent; it is because the theory has not, in the form he meets it, been carried far enough to fit it to the very complex facts of human affairs that we do not get that exact correspondence we might desire. The theoretical distribution of error may be expected to underlie all phenomena, just as the attraction of gravity underlies the action of all machinery. We cannot explain the motion of machinery by gravity alone, we need to consider also other natural forces, not so easily measured as gravity; but still less can we explain that action if we ignore the force of gravity. It is hoped that the short treatment here given of the elements of so important a subject may make smoother the approach to a field of investigation where there is great promise of harvest but where the reapers are as yet few.

Note. While this book has been in the Press, an article by Prof. Pearson has appeared in the Philosophical Magazine, July 1900, violently criticising the method adopted by most of his predecessors, who have investigated the applicability of the Law of Error to Statistics, that is to say, the method of first deducing the equation from à priori considerations, and then comparing the results with experiments. By means of a criterion of "fitting," which should be carefully studied, he shows that the chances that the statistics, with which Airy and Merriman illustrate the theory, would have arisen from random sampling are only .01423 and .000,00155 respectively on their hypothesis, and deduces "that the normal curve possesses no special fitness for describing errors or deviations such as arise either in observing practice or in nature." It is to be remarked on this, first that the investigation of two examples does not prove his case, secondly that his criticism does not apply to such curves as the asymmetrical curve of error treated exhaustively by Professor Edgeworth, and thirdly that the claim of the authors, whom he treats with such contempt, is not that the fit is exact, but "that the formula represents with all practicable accuracy the observed frequency" (Airy, quoted by Pearson) or "that the agreement is very satisfactory" (Merriman): thus the authors in question make no claim that the normal law is the complete explanation of the observed errors, but are satisfied with the approximation they found: it was not to be expected that the pioneers in the field should attain finality. By a similar process the law of gravitation might be treated with derision by criticising the experiments of an Attwood's machine, when the resistance of the air was not considered. Prof. Pearson has four constants in the curve by which he attains a close fit in his Illustration IV., and by increasing the number of his constants might obtain an absolute fit. With those developments of the normal curve of error, which depend on hypotheses very similar to those used by the earlier writers (see p. 303, supra, and Professor Edgeworth's recent contributions to the Statistical Journal), more constants are present, and there is every likelihood that equally close agreement may be found. The present author does not, however, wish to enter here into the controversy as to which is the best formula for classifying phenomena. His intention has been to follow in the beaten track, and there can be little doubt that the ordinary reader will prefer to find some à priori justification for the unfamiliar theory that natural phenomena can be represented by the formulae of algebraic probability, pace the author of The Grammar of Science, though he may recognise that the ultimate justification for the theory must be experience. There is no suggestion in this book that the whole of nature can be measured by the foot-rule of the normal curve of error; but yet that it may be a useful instrument has been shown by few people more conclusively than by Prof. Pearson himself.

In the following list will be found those books and articles relating to the subject 
of Part II. of this book which are most accessible and likely tu be most useful to the 
English student. Further references to foreign authors and to earlier writers will of 
course be found in the works here mentioned : — 

Todhunter, I.— History of the Theory of Probability, especially Arts. 993-1002. 
Encyclopædia Britannica.— Article on Probability. 
Dictionary of Political Economy (Palgrave's).— Article on Law of Error. 
Galton, F. — Inquiries into Human Faculty and its Development. 

Natural Inheritance. 

Family Likeness in Stature, Proc. of Royal Soc., 1886, 1888. 

Merriman, M.— Method of Least Squares. 
Chauvenet.— Practical and Spherical Astronomy, vol. ii., App. 
Edgeworth, Prof. F. Y.— In the London, Edinburgh and Dublin Philosophical 
Magazine and Journal of Science (formerly issued under other similar titles, 
and known as the Literary and Philosophical Magazine), 5th series. 

Vols. 21, 22, 23, 24, 25, 30. Various investigations and examples. 

Vols. 34, 35, 36. Correlation. 

Vol. 41. Asymmetrical law of error. 

In the Journal of the Royal Statistical Society, 1886 and Jubilee Volume. 

Methods of Statistics. 

1888 and 1890. Chance in competitive examinations. 

1893 and 1894. Correlation. 

1895. Recent contributions (Pearson's) to theory. 

1896, 1897, and 1898. Miscellaneous applications of the Calculus of Probabilities. 
1899 and 1900. Representation of Statistics by Mathematical Formulae. 

Report of Committee of British Association on Monetary Standard, 1888. 

Camb. Phil. Soc. Trans., 1885 and 1886. Merits of various means. 

[The exact titles of the above articles may be found from the indexes of the 

volumes mentioned.] 
Pearson, Prof. K.— The Chance of Death and other Essays. 

The Grammar of Science, chaps. x., xi. 

Contributions to Mathematical Theory of Evolution in Transactions of Royal 

Society, 1894, 1895, 1896, and Stat. Soc. Journal, 1896, 1897. 

Probable errors of frequency constants. Royal Soc. Trans., 1898. 

Criterion . . . of Deviations . . . in a Correlated System . . . Phil. Mag., 

July 1900. 
Venn, Dr J. — The Logic of Chance. 

Nature and Use of Averages, Stat. Journal, 1891. 

Cambridge Anthropometry, Journal of Anthropological Institute, Nov. 1888. 
Yule, G. U.— History of Pauperism. Stat. Journal, 1896. 

Theory of Correlation. Do. 1897. 

Changes in Pauperism. Do. 1899. 

Association of Attributes in Statistics, Royal Soc. Trans., 1900. 

Bowley, A. L.— Accuracy of an Average. Stat. Journal, 1897. 

Sheppard, W. F.— On the calculation of the Average Square. Stat. Journal, 1897. 

Use of Auxiliary Curves. Stat. Journal, 1900. 

Normal Correlation. Camb. Phil. Society, vol. xix. 

Normal Distribution and Correlation. Royal Soc. Trans., 1898. 
Accuracy, 199-214. 

Age, 29, 147, 251, 312. 

Agricultural Wages, 50-52, 97-103, 109, 

110, 115-118. 
Arithmetic Average or Mean, 107-110, 109, 

125, 126, -128, 129, 130, 136, 221 ; 

error in, 204, 306. 
Average error, 28^^ 2g2. 
Average wage, 6, 11. 
Average : Precision of, 305-308. 
Averages, 7, 19, 89, 92, 95, 107-130, 130, 

133-140, 143, 214, 264; see Arithmetic 

Average, Median, Mode, Weighted 

Average. 

Bertillon, Dr J., 17, 129, 130, 158. 
Bias, 118. 

Biassed errors, 209-214, 219. 
Bibliography : of Interpolation, 258 ; of 

Law of Error, 327. 
Binomial Expansion or Theorem, 265, 

272.277, 288, 291, 301-2, 306. 
Births, 287. 
Blank Forms, 18, 19, 26, 27, 30, 37, 42, 

46, 63 ; specimens of, 23, 35, 36, 45, 

48, 51, 52, 54-58, 65, 67, 69. 
Boole, 242, 247, 251. 
Booth, C., 9, 27, 32, 78-80, 123, 158, 

251. 
Bortkewitsch, Dr, 302. 

Cartograms, 156-158. 

Census: Population, 10, 11, 23-32, 63, 
78-81, 82, 99, 233. 

Census: Wage, 11, 12, 33-40, 63, 87, 
92-96, 114, 125, 233. 

Chance, 266, 267 ; see Probability. 

Changes in Wages, 54-58, 61, 97-103. 

Chauvenet, 303. 

Coefficient: of Correlation, j/i?; of Pro- 
bability, 599; of Regression, j^j". 

Coefficients: Statistical, 7^9, 130, 296, 
299. 

Collection of material, 17, 18. 

"Combinational" Groups, 299. 

Comparison : Accuracy of, 206, 212, 305. 

Comparisons of Series, 168-177, 192-194. 



Consumption : Index No. of, 228. 

Correlation, 316, 317-326. 

Cotton: wages, 39, 95, 96, 114; trade, 

164-167. 
Curve of Error : see Error, Law of. 
Curves of Frequency, 30^^- 
Cycles of Trade, 153, 181. 

Darwin, G. H., 256. 

De Morgan, 242, 247. 

Deciles, 124, 125-128, 133, 136, 144. 